
Usage Guidelines

This lesson is part of the DS Lab core curriculum. For that reason, this notebook can only be used on your WQU virtual machine.

This means:

  • ⓧ No downloading this notebook.
  • ⓧ No re-sharing of this notebook with friends or colleagues.
  • ⓧ No downloading the embedded videos in this notebook.
  • ⓧ No re-sharing embedded videos with friends or colleagues.
  • ⓧ No adding this notebook to public or private repositories.
  • ⓧ No uploading this notebook (or screenshots of it) to other websites, including websites for study resources.

8.5 Volatility Forecasting in South Africa 🇿🇦

In this assignment you'll build a model to predict stock volatility for the telecommunications company MTN Group.

Tip: There are some tasks in this assignment that you can complete by importing functions and classes you created for your app. Give it a try!

Warning: There are some tasks in this assignment where there is an extra code block that will transform your work into a submission that's compatible with the grader. Be sure to run those cells and inspect the submission before you submit to the grader.

[59]:
%load_ext autoreload
%autoreload 2

import wqet_grader
from arch.univariate.base import ARCHModelResult

wqet_grader.init("Project 8 Assessment")

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
[54]:
# Import your libraries here
import pandas as pd
import numpy as np
import requests
import sqlite3
import matplotlib.pyplot as plt
from arch import arch_model
from config import settings
from data import SQLRepository
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

Working with APIs

Task 8.5.1: Create a URL to get all the stock data for MTN Group ("MTNOY") from AlphaVantage in JSON format. Be sure to use the https://learn-api.wqu.edu hostname. And don't worry: your submission won't include your API key!

[12]:
ticker = "MTNOY"
output_size = "full"
data_type = "json"

url = (
    "https://learn-api.wqu.edu/1/data-services/alpha-vantage/query?"
    "function=TIME_SERIES_DAILY&"
    f"symbol={ticker}&"
    f"outputsize={output_size}&"
    f"datatype={data_type}&"
    f"apikey={settings.alpha_api_key}"
)

print("url type:", type(url))
url
url type: <class 'str'>
[12]:
'https://learn-api.wqu.edu/1/data-services/alpha-vantage/query?function=TIME_SERIES_DAILY&symbol=MTNOY&outputsize=full&datatype=json&apikey=24a111279c8d706a7407de7aff7158b94eaa5cdc1a7829cc066815322162aa712b23f56fc92f2111487a934bc281ca4ed1fb1e40ce66224c5a03e0aa32d3198792624ed0898105426b0b7e2676c56dc45254dd5b9d5f2be2bb6bfde2e057b0cd5111055fa92a5402206ad4a2847164cba13ed8561e609645c17ffe5b05f97d19'
[13]:
# Remove API key for submission
submission_851 = url[:170]
submission_851
[13]:
'https://learn-api.wqu.edu/1/data-services/alpha-vantage/query?function=TIME_SERIES_DAILY&symbol=MTNOY&outputsize=full&datatype=json&apikey=24a111279c8d706a7407de7aff7158b'
[14]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.1", submission_851)

Python master 😁

Score: 1

Task 8.5.2: Create an HTTP request for the URL you created in the previous task. The grader will evaluate your work by looking at the ticker symbol in the "Meta Data" key-value pair in your response.

[100]:
response = requests.get(url)

print("response type:", type(response))
response type: <class 'requests.models.Response'>
[19]:
# Get symbol in `"Meta Data"`
submission_852 = response.json()["Meta Data"]["2. Symbol"]
submission_852
[19]:
'MTNOY'
[20]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.2", submission_852)

Wow, you're making great progress.

Score: 1

Task 8.5.3: Get the status code of your response and assign it to the variable response_code.

[27]:
response_code = response.status_code

print("code type:", type(response_code))
response_code
code type: <class 'int'>
[27]:
200
[28]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.3", response_code)

Wow, you're making great progress.

Score: 1

Test-Driven Development

Task 8.5.4: Create a DataFrame df_mtnoy with all the stock data for MTN. Make sure that the DataFrame has the correct type of index and column names. The grader will evaluate your work by looking at the row in df_mtnoy for 6 December 2021.

[36]:
response_data = response.json()
stock_data = response_data["Time Series (Daily)"]

df_mtnoy = pd.DataFrame.from_dict(stock_data, orient="index", dtype="float")
df_mtnoy.index = pd.to_datetime(df_mtnoy.index)
df_mtnoy.index.name = "date"
df_mtnoy.columns = [c.split(". ")[1] for c in df_mtnoy.columns]
print("df_mtnoy type:", type(df_mtnoy))
df_mtnoy.head()
df_mtnoy type: <class 'pandas.core.frame.DataFrame'>
[36]:
open high low close volume
date
2023-01-26 8.255 8.2950 8.248 8.295 3462.0
2023-01-25 8.100 8.1425 8.075 8.075 28029.0
2023-01-24 8.030 8.0850 8.020 8.020 7391.0
2023-01-23 7.890 8.0265 7.890 7.980 16090.0
2023-01-20 7.810 7.9300 7.810 7.930 10861.0
[37]:
# Get row for 6 Dec 2021
submission_854 = df_mtnoy.loc["2021-12-06"].to_frame().T
submission_854
[37]:
open high low close volume
2021-12-06 10.16 10.18 10.11 10.11 13542.0
[38]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.4", submission_854)

Way to go!

Score: 1

Task 8.5.5: Connect to the database whose name is stored in the .env file for this project. Be sure to set the check_same_thread argument to False. Assign the connection to the variable connection. The grader will evaluate your work by looking at the database location assigned to connection.

[96]:
connection = sqlite3.connect(database=settings.db_name, check_same_thread=False)
connection
[96]:
<sqlite3.Connection at 0x7f764e6457b0>
[40]:
# Get location of database for `connection`
submission_855 = connection.cursor().execute("PRAGMA database_list;").fetchall()[0][-1]
submission_855
[40]:
'/home/jovyan/work/ds-curriculum/080-volatility-forecasting-in-india/stocks.sqlite'
[41]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.5", submission_855)

Very impressive.

Score: 1

Task 8.5.6: Insert df_mtnoy into your database. The grader will evaluate your work by looking at the first five rows of the MTNOY table in the database.

[42]:
# Insert `MTNOY` data into database
repo = SQLRepository(connection=connection)
repo.insert_table(table_name="MTNOY", records=df_mtnoy, if_exists="replace")
[42]:
{'transaction_successful': True, 'records_inserted': 3913}
[43]:
# Get first five rows of `MTNOY` table
submission_856 = pd.read_sql(sql="SELECT * FROM MTNOY LIMIT 5", con=connection)
submission_856
[43]:
date open high low close volume
0 2023-01-26 00:00:00 8.255 8.2950 8.248 8.295 3462.0
1 2023-01-25 00:00:00 8.100 8.1425 8.075 8.075 28029.0
2 2023-01-24 00:00:00 8.030 8.0850 8.020 8.020 7391.0
3 2023-01-23 00:00:00 7.890 8.0265 7.890 7.980 16090.0
4 2023-01-20 00:00:00 7.810 7.9300 7.810 7.930 10861.0
[44]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.6", submission_856)

Party time! 🎉🎉🎉

Score: 1

Task 8.5.7: Read the MTNOY table from your database and assign the output to df_mtnoy_read. The grader will evaluate your work by looking at the row for 27 April 2022.

[45]:
df_mtnoy_read = repo.read_table(table_name="MTNOY")

print("df_mtnoy_read type:", type(df_mtnoy_read))
print("df_mtnoy_read shape:", df_mtnoy_read.shape)
df_mtnoy_read.head()
df_mtnoy_read type: <class 'pandas.core.frame.DataFrame'>
df_mtnoy_read shape: (3913, 5)
[45]:
open high low close volume
date
2023-01-26 8.255 8.2950 8.248 8.295 3462.0
2023-01-25 8.100 8.1425 8.075 8.075 28029.0
2023-01-24 8.030 8.0850 8.020 8.020 7391.0
2023-01-23 7.890 8.0265 7.890 7.980 16090.0
2023-01-20 7.810 7.9300 7.810 7.930 10861.0
[46]:
# Get row for 27 April 2022
submission_857 = df_mtnoy_read.loc["2022-04-27"].to_frame().T
submission_857
[46]:
open high low close volume
2022-04-27 10.71 10.85 10.5 10.65 23927.0
[47]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.7", submission_857)

Excellent! Keep going.

Score: 1

Predicting Volatility

Prepare Data

Task 8.5.8: Create a Series y_mtnoy with the 2,500 most recent returns for MTN. The grader will evaluate your work by looking at the return for 9 August 2022.

[48]:
df = repo.read_table(table_name="MTNOY", limit=2500 + 1)
df.sort_index(inplace=True)

df["return"] = df["close"].pct_change() * 100
y_mtnoy = df["return"].dropna()

print("y_mtnoy type:", type(y_mtnoy))
print("y_mtnoy shape:", y_mtnoy.shape)
y_mtnoy.head()
y_mtnoy type: <class 'pandas.core.series.Series'>
y_mtnoy shape: (2500,)
[48]:
date
2013-02-22   -0.970874
2013-02-25   -1.176471
2013-02-26    0.694444
2013-02-27   -0.492611
2013-02-28   -2.722772
Name: return, dtype: float64
[49]:
# Get return for 9 Aug 2022
submission_859 = float(y_mtnoy["2022-08-09"])
submission_859
[49]:
1.5783540022547893
[50]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.8", submission_859)

Yes! Your hard work is paying off.

Score: 1

Task 8.5.9: Calculate daily volatility for y_mtnoy, and assign the result to mtnoy_daily_volatility.

[51]:
mtnoy_daily_volatility = y_mtnoy.std()

print("mtnoy_daily_volatility type:", type(mtnoy_daily_volatility))
print("MTN Daily Volatility:", mtnoy_daily_volatility)
mtnoy_daily_volatility type: <class 'float'>
MTN Daily Volatility: 2.914284074738285
[52]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.9", mtnoy_daily_volatility)

Wow, you're making great progress.

Score: 1

Task 8.5.10: Calculate the annual volatility for y_mtnoy, and assign the result to mtnoy_annual_volatility.

[55]:
mtnoy_annual_volatility = mtnoy_daily_volatility * np.sqrt(252)

print("mtnoy_annual_volatility type:", type(mtnoy_annual_volatility))
print("MTN Annual Volatility:", mtnoy_annual_volatility)
mtnoy_annual_volatility type: <class 'numpy.float64'>
MTN Annual Volatility: 46.26282546932085
[56]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.10", float(mtnoy_annual_volatility))

You = coding 🥷

Score: 1

Task 8.5.11: Create a time series line plot for y_mtnoy. Be sure to label the x-axis "Date", the y-axis "Returns", and use the title "Time Series of MTNOY Returns".

[57]:
# Create `fig` and `ax`
fig, ax = plt.subplots(figsize=(15, 6))

# Plot `y_mtnoy` on `ax`
y_mtnoy.plot(ax=ax)

# Add axis labels
plt.xlabel("Date")
plt.ylabel("Returns")

# Add title
plt.title("Time Series of MTNOY Returns")

# Don't delete the code below 👇
plt.savefig("images/8-5-11.png", dpi=150)
[58]:
with open("images/8-5-11.png", "rb") as file:
    wqet_grader.grade("Project 8 Assessment", "Task 8.5.11", file)

Awesome work.

Score: 1

Task 8.5.12: Create an ACF plot of the squared returns for MTN. Be sure to label the x-axis "Lag [days]", the y-axis "Correlation Coefficient", and use the title "ACF of MTNOY Squared Returns".

[60]:
# Create `fig` and `ax`
fig, ax = plt.subplots(figsize=(15, 6))

# Create ACF of squared returns
plot_acf(y_mtnoy ** 2, ax=ax);

# Add axis labels
plt.xlabel("Lag [days]")
plt.ylabel("Correlation Coefficient")

# Add title
plt.title("ACF of MTNOY Squared Returns")

# Don't delete the code below 👇
plt.savefig("images/8-5-12.png", dpi=150)
[61]:
with open("images/8-5-12.png", "rb") as file:
    wqet_grader.grade("Project 8 Assessment", "Task 8.5.12", file)

Yes! Great problem solving.

Score: 1

Task 8.5.13: Create a PACF plot of the squared returns for MTN. Be sure to label the x-axis "Lag [days]", the y-axis "Correlation Coefficient", and use the title "PACF of MTNOY Squared Returns".

[63]:
# Create `fig` and `ax`
fig, ax = plt.subplots(figsize=(15, 6))

# Create PACF of squared returns
plot_pacf(y_mtnoy ** 2, ax=ax);

# Add axis labels
plt.xlabel("Lag [days]")
plt.ylabel("Correlation Coefficient")

# Add title
plt.title("PACF of MTNOY Squared Returns")

# Don't delete the code below 👇
plt.savefig("images/8-5-13.png", dpi=150)
[64]:
with open("images/8-5-13.png", "rb") as file:
    wqet_grader.grade("Project 8 Assessment", "Task 8.5.13", file)

You're making this look easy. 😉

Score: 1

Task 8.5.14: Create a training set y_mtnoy_train that contains the first 80% of the observations in y_mtnoy.

[67]:
cutoff_test = int(len(y_mtnoy) * 0.8)
y_mtnoy_train = y_mtnoy.iloc[:cutoff_test]

print("y_mtnoy_train type:", type(y_mtnoy_train))
print("y_mtnoy_train shape:", y_mtnoy_train.shape)
y_mtnoy_train.head()
y_mtnoy_train type: <class 'pandas.core.series.Series'>
y_mtnoy_train shape: (2000,)
[67]:
date
2013-02-22   -0.970874
2013-02-25   -1.176471
2013-02-26    0.694444
2013-02-27   -0.492611
2013-02-28   -2.722772
Name: return, dtype: float64
[68]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.14", y_mtnoy_train)

You got it. Dance party time! 🕺💃🕺💃

Score: 1

Build Model

Task 8.5.15: Build and fit a GARCH model using the data in y_mtnoy. Try different values for p and q, using the summary to assess its performance. The grader will evaluate whether your model is the correct data type.

[70]:
# Build and train model
model = arch_model(
    y_mtnoy_train,
    p=1,
    q=1,
    rescale=False
).fit(disp=0)

print("model type:", type(model))

# Show model summary
model.summary()
model type: <class 'arch.univariate.base.ARCHModelResult'>
[70]:
Constant Mean - GARCH Model Results
Dep. Variable: return R-squared: 0.000
Mean Model: Constant Mean Adj. R-squared: 0.000
Vol Model: GARCH Log-Likelihood: -4737.24
Distribution: Normal AIC: 9482.48
Method: Maximum Likelihood BIC: 9504.88
No. Observations: 2000
Date: Fri, Jan 27 2023 Df Residuals: 1999
Time: 05:34:29 Df Model: 1
Mean Model
coef std err t P>|t| 95.0% Conf. Int.
mu -0.0181 5.440e-02 -0.333 0.739 [ -0.125,8.852e-02]
Volatility Model
coef std err t P>|t| 95.0% Conf. Int.
omega 0.1290 5.796e-02 2.226 2.603e-02 [1.540e-02, 0.243]
alpha[1] 0.0740 1.754e-02 4.221 2.431e-05 [3.965e-02, 0.108]
beta[1] 0.9124 1.925e-02 47.394 0.000 [ 0.875, 0.950]


Covariance estimator: robust
[71]:
submission_8515 = isinstance(model, ARCHModelResult)
submission_8515
[71]:
True
[72]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.15", [submission_8515])

Yes! Your hard work is paying off.

Score: 1

Task 8.5.16: Plot the standardized residuals for your model. Be sure to label the x-axis "Date", the y-axis "Value", and use the title "MTNOY GARCH Model Standardized Residuals".

[73]:
# Create `fig` and `ax`
fig, ax = plt.subplots(figsize=(15, 6))

# Plot standardized residuals
model.std_resid.plot(ax=ax)

# Add axis labels
plt.xlabel("Date")
plt.ylabel("Value")

# Add title
plt.title("MTNOY GARCH Model Standardized Residuals")

# Don't delete the code below 👇
plt.savefig("images/8-5-16.png", dpi=150)
[74]:
with open("images/8-5-16.png", "rb") as file:
    wqet_grader.grade("Project 8 Assessment", "Task 8.5.16", file)

Awesome work.

Score: 1

Task 8.5.17: Create an ACF plot of the squared, standardized residuals of your model. Be sure to label the x-axis "Lag [days]", the y-axis "Correlation Coefficient", and use the title "ACF of MTNOY GARCH Model Standardized Residuals".

[75]:
# Create `fig` and `ax`
fig, ax = plt.subplots(figsize=(15, 6))

# Create ACF of squared, standardized residuals
plot_acf(model.std_resid ** 2, ax=ax);

# Add axis labels
plt.xlabel("Lag [days]")
plt.ylabel("Correlation Coefficient")

# Add title
plt.title("ACF of MTNOY GARCH Model Standardized Residuals")

# Don't delete the code below 👇
plt.savefig("images/8-5-17.png", dpi=150)
[76]:
with open("images/8-5-17.png", "rb") as file:
    wqet_grader.grade("Project 8 Assessment", "Task 8.5.17", file)

You = coding 🥷

Score: 1

Model Deployment

Ungraded Task: If it's not already running, start your app server.

Task 8.5.18: Change the fit method of your GarchModel class so that, when a model is done training, two more attributes are added to the object: self.aic with the AIC for the model, and self.bic with the BIC for the model. When you're done, use the cell below to check your work.

Tip: How can you access the AIC and BIC scores programmatically? Every ARCHModelResult has an .aic and a .bic attribute.
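The change the task describes is small. Here is a minimal sketch of where the two attributes get attached; `_train` is a stub standing in for the real `arch_model(self.data, p=p, q=q, rescale=False).fit(disp=0)` call, since the actual `GarchModel` internals live in the project's `model` module:

```python
from types import SimpleNamespace


class GarchModel:
    """Sketch only: `_train` stands in for the real `arch` fit call,
    which returns an `ARCHModelResult` carrying `.aic` and `.bic`."""

    def __init__(self, data):
        self.data = data

    def _train(self, p, q):
        # Stub result with the two attributes we care about
        return SimpleNamespace(aic=9482.48, bic=9504.88)

    def fit(self, p, q):
        self.model = self._train(p, q)
        # The two new lines the task asks for: copy the information
        # criteria from the fitted result onto the wrapper object
        self.aic = self.model.aic
        self.bic = self.model.bic
        return self


m = GarchModel(data=None).fit(p=1, q=1)
assert hasattr(m, "aic") and hasattr(m, "bic")
```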

[85]:
# Import `build_model` function
from main import build_model

# Build model using new `MTNOY` data
model = build_model(ticker="MTNOY", use_new_data=True)

# Wrangle `MTNOY` returns
model.wrangle_data(n_observations=2500)

# Fit GARCH(1,1) model to data
model.fit(p=1, q=1)

# Does model have AIC and BIC attributes?
assert hasattr(model, "aic")
assert hasattr(model, "bic")
[86]:
# Put test results into dictionary
submission_8518 = {"has_aic": hasattr(model, "aic"), "has_bic": hasattr(model, "bic")}
submission_8518
[86]:
{'has_aic': True, 'has_bic': True}
[87]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.18", submission_8518)

Yes! Great problem solving.

Score: 1

Task 8.5.19: Change the fit_model function in the main module so that the "message" it returns includes the AIC and BIC scores. For example, the message should look something like this:

"Trained and saved 'models/2022-10-12T23:10:06.577238_MTNOY.pkl'. Metrics: AIC 9892.184665169907, BIC 9914.588275008075."

When you're done, use the cell below to check your work.
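One way to assemble that message is with an f-string. The helper below is a hypothetical sketch; inside the real `fit_model`, `filename` would come from `model.dump()` and the scores from the `aic`/`bic` attributes added in the previous task:

```python
def build_message(filename, aic, bic):
    """Hypothetical helper showing the message format the task asks for."""
    return f"Trained and saved '{filename}'. Metrics: AIC {aic}, BIC {bic}."


msg = build_message(
    "models/2022-10-12T23:10:06.577238_MTNOY.pkl",
    9892.184665169907,
    9914.588275008075,
)
# msg now matches the example message above
```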

[91]:
# Import `FitIn` class and `fit_model` function
from main import FitIn, fit_model

# Instantiate `FitIn` object
request = FitIn(ticker="MTNOY", use_new_data=False, n_observations=2500, p=1, q=1)

# Build model and fit to data, following parameters in `request`
fit_out = fit_model(request=request)

# Inspect `fit_out`
fit_out
[91]:
{'ticker': 'MTNOY',
 'p': 1,
 'q': 1,
 'n_observations': 2500,
 'use_new_data': False,
 'success': True,
 'message': 'Trained and save models/2023-01-27T05:48:09.972264_MTNOY.pkl. Metrics: AIC 12006.325788440472, BIC 12029.621972483897.'}
[92]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.19", fit_out)

You got it. Dance party time! 🕺💃🕺💃

Score: 1

Task 8.5.20: Create a post request to hit the "/fit" path running at "http://localhost:8008". You should train a GARCH(1,1) model on 2500 observations of the MTN data you already downloaded. Pass in your parameters as a dictionary using the json argument. The grader will evaluate the JSON of your response.

[101]:
# URL of `/fit` path
url = "http://localhost:8008/fit"

# Data to send to path
json = {
    "ticker": "MTNOY",
    "use_new_data": False,
    "n_observations": 2500,
    "p": 1,
    "q": 1,
}

# Response of post request
response = requests.post(url=url, json=json)

print("response type:", type(response))
print("response status code:", response.status_code)
response type: <class 'requests.models.Response'>
response status code: 200
[102]:
submission_8520 = response.json()
submission_8520
[102]:
{'ticker': 'MTNOY',
 'p': 1,
 'q': 1,
 'n_observations': 2500,
 'use_new_data': False,
 'success': True,
 'message': 'Trained and save models/2023-01-27T06:00:24.703316_MTNOY.pkl. Metrics: AIC 12006.325788440472, BIC 12029.621972483897.'}
[104]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.20", submission_8520)
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
Cell In [104], line 1
----> 1 wqet_grader.grade("Project 8 Assessment", "Task 8.5.20", submission_8520)

File /opt/conda/lib/python3.9/site-packages/wqet_grader/__init__.py:182, in grade(assessment_id, question_id, submission)
    177 def grade(assessment_id, question_id, submission):
    178   submission_object = {
    179     'type': 'simple',
    180     'argument': [submission]
    181   }
--> 182   return show_score(grade_submission(assessment_id, question_id, submission_object))

File /opt/conda/lib/python3.9/site-packages/wqet_grader/transport.py:146, in grade_submission(assessment_id, question_id, submission_object)
    144     raise Exception('Grader raised error: {}'.format(error['message']))
    145   else:
--> 146     raise Exception('Could not grade submission: {}'.format(error['message']))
    147 result = envelope['data']['result']
    149 # Used only in testing

Exception: Could not grade submission: Could not verify access to this assessment: Received error from WQET submission API: Could not find existing program enrollment for user

Task 8.5.21: Create a post request to hit the "/predict" path running at "http://localhost:8008". You should get the 5-day volatility forecast for MTN. When you're satisfied, submit your work to the grader.

[ ]:
# URL of `/predict` path
url = "http://localhost:8008/predict"

# Data to send to path (field names assume the `PredictIn` schema from
# Lesson 8.4: a ticker symbol and a forecast horizon in days)
json = {"ticker": "MTNOY", "n_days": 5}

# Response of post request
response = requests.post(url=url, json=json)

print("response type:", type(response))
print("response status code:", response.status_code)
[ ]:
submission_8521 = response.json()
submission_8521
[ ]:
wqet_grader.grade("Project 8 Assessment", "Task 8.5.21", submission_8521)

Copyright 2022 WorldQuant University. This content is licensed solely for personal use. Redistribution or publication of this material is strictly prohibited.


8.4. Model Deployment


Ready for deployment! Over the last three lessons, we've built all the pieces we need for our application. We have a module for getting and storing our data. We have the code to train our model and clean its predictions. In this lesson, we're going to put all those pieces together and deploy our model with an API that others can use to train their own models and predict volatility. We'll start by creating a module for all the code we created in the last lesson. Then we'll complete our main module, which will hold our FastAPI application with two paths: one for model training and one for prediction. Let's jump in!

[2]:
%load_ext autoreload
%autoreload 2

import os
import sqlite3
from glob import glob

import joblib
import pandas as pd
import requests
import wqet_grader
from arch.univariate.base import ARCHModelResult
from config import settings
from data import SQLRepository
from IPython.display import VimeoVideo

wqet_grader.init("Project 8 Assessment")
[3]:
VimeoVideo("772219745", h="f3bfda20cd", width=600)

Model Module

We created a lot of code in the last lesson for building, training, and making predictions with our GARCH(1,1) model. We want this code to be reusable, so let's put it in its own module.

Let's start by instantiating a repository that we'll use for testing our module as we build.

[4]:
VimeoVideo("772219717", h="8f1afa7919", width=600)

Task 8.4.1: Create a SQLRepository named repo. Be sure that it's attached to a SQLite connection.

  • Open a connection to a SQL database using sqlite3.
[5]:
connection = sqlite3.connect(settings.db_name, check_same_thread=False)
repo = SQLRepository(connection=connection)

print("repo type:", type(repo))
print("repo.connection type:", type(repo.connection))
repo type: <class 'data.SQLRepository'>
repo.connection type: <class 'sqlite3.Connection'>

Now that we have the repo ready, we'll shift to our model module and build a GarchModel class to hold all our code from the last lesson.

[6]:
VimeoVideo("772219669", h="1d225ab776", width=600)

Task 8.4.2: In the model module, create a definition for a GarchModel class. For now, it should only have an __init__ method. Use the docstring as a guide. When you're done, test your class using the assert statements below.

  • What's a class?
  • Write a class definition in Python.
  • Write a class method in Python.
  • What's an assert statement?
  • Write an assert statement in Python.
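Based on the assert statements in the next cell, the `__init__` might look something like this. This is a sketch: in the project, `model_directory` is read from `settings.model_directory` rather than defaulted here.

```python
class GarchModel:
    """Sketch of the class skeleton this task describes.

    Attribute names follow the assert statements used to test it:
    `ticker`, `repo`, `use_new_data`, and `model_directory`.
    """

    def __init__(self, ticker, repo, use_new_data, model_directory="models"):
        # In the project, `model_directory` comes from `settings.model_directory`
        self.ticker = ticker
        self.repo = repo
        self.use_new_data = use_new_data
        self.model_directory = model_directory


gm = GarchModel(ticker="AMBUJACEM.BSE", repo=None, use_new_data=False)
```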
[8]:
from model import GarchModel

# Instantiate a `GarchModel`
gm_ambuja = GarchModel(ticker="AMBUJACEM.BSE", repo=repo, use_new_data=False)

# Does `gm_ambuja` have the correct attributes?
assert gm_ambuja.ticker == "AMBUJACEM.BSE"
assert gm_ambuja.repo == repo
assert not gm_ambuja.use_new_data
assert gm_ambuja.model_directory == settings.model_directory
[9]:
VimeoVideo("772219593", h="3f3c401c04", width=600)

Task 8.4.3: Turn your wrangle_data function from the last lesson into a method for your GarchModel class. When you're done, use the assert statements below to test the method by getting and wrangling data for the department store Shoppers Stop.

  • What's a function?
  • Write a function in Python.
  • Write a class method in Python.
  • What's an assert statement?
  • Write an assert statement in Python.
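As a standalone sketch, the return-wrangling logic mirrors the data preparation used elsewhere in this project. The `use_new_data` branch that fetches fresh prices from AlphaVantage is omitted, and `FakeRepo` is a stand-in for `SQLRepository` so the sketch runs on its own:

```python
import pandas as pd


def wrangle_data(repo, ticker, n_observations):
    """Pull the most recent close prices and convert them to percent returns."""
    # Read one extra row because `pct_change` consumes the first observation
    df = repo.read_table(table_name=ticker, limit=n_observations + 1)
    df.sort_index(inplace=True)                    # oldest first
    df["return"] = df["close"].pct_change() * 100  # percent returns
    return df["return"].dropna()                   # drop the leading NaN


class FakeRepo:
    """Stand-in for SQLRepository; returns newest-first rows like the database."""

    def read_table(self, table_name, limit):
        idx = pd.date_range("2023-01-02", periods=limit, freq="D")[::-1]
        closes = [104.0, 102.0, 100.0][:limit]
        return pd.DataFrame({"close": closes}, index=idx)


y = wrangle_data(FakeRepo(), "MTNOY", n_observations=2)
```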
[10]:
# Instantiate `GarchModel`, use new data
model_shop = GarchModel(ticker="SHOPERSTOP.BSE", repo=repo, use_new_data=True)

# Check that model doesn't have `data` attribute yet
assert not hasattr(model_shop, "data")

# Wrangle data
model_shop.wrangle_data(n_observations=1000)

# Does model now have `data` attribute?
assert hasattr(model_shop, "data")

# Is the `data` a Series?
assert isinstance(model_shop.data, pd.Series)

# Is Series correct shape?
assert model_shop.data.shape == (1000,)

model_shop.data.head()
[10]:
date
2019-01-15   -0.136041
2019-01-16    1.002238
2019-01-17   -2.003854
2019-01-18   -0.471884
2019-01-21    1.363098
Name: return, dtype: float64
[11]:
VimeoVideo("772219535", h="55fbfdff55", width=600)

Task 8.4.4: Using your code from the previous lesson, create a fit method for your GarchModel class. When you're done, use the code below to test it.

  • Write a class method in Python.
  • What's an assert statement?
  • Write an assert statement in Python.
[12]:
# Instantiate `GarchModel`, use old data
model_shop = GarchModel(ticker="SHOPERSTOP.BSE", repo=repo, use_new_data=False)

# Wrangle data
model_shop.wrangle_data(n_observations=1000)

# Fit GARCH(1,1) model to data
model_shop.fit(p=1, q=1)

# Does `model_shop` have a `model` attribute now?
assert hasattr(model_shop, "model")

# Is model correct data type?
assert isinstance(model_shop.model, ARCHModelResult)

# Does model have correct parameters?
assert model_shop.model.params.index.tolist() == ["mu", "omega", "alpha[1]", "beta[1]"]

# Check model parameters
model_shop.model.summary()
[12]:
Constant Mean - GARCH Model Results
Dep. Variable: return R-squared: 0.000
Mean Model: Constant Mean Adj. R-squared: 0.000
Vol Model: GARCH Log-Likelihood: -2428.86
Distribution: Normal AIC: 4865.73
Method: Maximum Likelihood BIC: 4885.36
No. Observations: 1000
Date: Thu, Jan 26 2023 Df Residuals: 999
Time: 12:12:19 Df Model: 1
Mean Model
coef std err t P>|t| 95.0% Conf. Int.
mu 0.1028 7.712e-02 1.333 0.183 [-4.836e-02, 0.254]
Volatility Model
coef std err t P>|t| 95.0% Conf. Int.
omega 0.1464 0.188 0.781 0.435 [ -0.221, 0.514]
alpha[1] 0.0372 2.485e-02 1.497 0.134 [-1.151e-02,8.590e-02]
beta[1] 0.9468 4.437e-02 21.340 4.838e-101 [ 0.860, 1.034]


Covariance estimator: robust
[13]:
VimeoVideo("772219489", h="3de8abb0e6", width=600)

Task 8.4.5: Using your code from the previous lesson, create a predict_volatility method for your GarchModel class. Your method will need to return predictions as a dictionary, so you'll need to add your clean_prediction function as a helper method. When you're done, test your work using the assert statements below.

  • Write a class method in Python.
  • Write a function in Python.
  • What's an assert statement?
  • Write an assert statement in Python.
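The `clean_prediction` helper might be sketched like this. Two assumptions: the forecast arrives as the one-row variance DataFrame that `ARCHModelResult.forecast(...).variance` returns, and forecast steps are mapped to business days starting after the last training observation:

```python
import pandas as pd


def clean_prediction(prediction_df):
    """Convert a one-row variance forecast into a {ISO date: volatility} dict."""
    # Forecast begins the day after the last training observation
    start = prediction_df.index[0] + pd.DateOffset(days=1)
    # One business day per horizon step
    dates = pd.bdate_range(start=start, periods=prediction_df.shape[1])
    # Variance -> volatility (standard deviation)
    data = prediction_df.values.flatten() ** 0.5
    return {d.isoformat(): float(v) for d, v in zip(dates, data)}


forecast = pd.DataFrame([[4.0, 9.0]], index=[pd.Timestamp("2023-01-25")])
clean_prediction(forecast)
# {'2023-01-26T00:00:00': 2.0, '2023-01-27T00:00:00': 3.0}
```

Casting each value with `float()` keeps the output JSON-serializable, which matters once the dictionary is returned from a FastAPI path.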
[16]:
# Generate prediction from `model_shop`
prediction = model_shop.predict_volatility(horizon=5)

# Is prediction a dictionary?
assert isinstance(prediction, dict)

# Are keys correct data type?
assert all(isinstance(k, str) for k in prediction.keys())

# Are values correct data type?
assert all(isinstance(v, float) for v in prediction.values())

prediction
[16]:
{'2023-01-26T00:00:00': 2.1863071831011722,
 '2023-01-27T00:00:00': 2.202192500759229,
 '2023-01-30T00:00:00': 2.217711789211658,
 '2023-01-31T00:00:00': 2.232876702276906,
 '2023-02-01T00:00:00': 2.247698343791981}

Things are looking good! There are two final methods we need to add to our GarchModel so that we can save a trained model and then load it when we need it. When we learned about saving and loading files in Project 5, we used a context handler. This time, we'll streamline the process using the joblib library. We'll also start writing our filepaths more programmatically using the os library.

[17]:
VimeoVideo("772219427", h="0dd5731a0d", width=600)

Task 8.4.6: Create a dump method for your GarchModel class. It should save the model assigned to the model attribute to the folder specified in your configuration settings. Use the docstring as a guide, and then test your work below.

  • Write a class method in Python.
  • Save an object using joblib.
  • Create a file path using os.
[20]:
# Save `model_shop` model, assign filename
filename = model_shop.dump()

# Is `filename` a string?
assert isinstance(filename, str)

# Does filename include ticker symbol?
assert model_shop.ticker in filename

# Does file exist?
assert os.path.exists(filename)

filename
[20]:
'models/2023-01-26T12:27:09.812765_SHOPERSTOP.BSE.pkl'
[21]:
VimeoVideo("772219326", h="4e1f9421e4", width=600)

Task 8.4.7: Create a load function below that will take a ticker symbol as input and return a model. When you're done, use the next cell to load the Shoppers Stop model you saved in the previous task.

  • Handle errors using try and except blocks in Python.
  • Create a file path using os.
  • Raise an Exception in Python.
[25]:
def load(ticker):

    """Load latest model from model directory.

    Parameters
    ----------
    ticker : str
        Ticker symbol for which model was trained.

    Returns
    -------
    `ARCHModelResult`
    """
    # Create pattern for glob search
    pattern = os.path.join(settings.model_directory, f"*{ticker}.pkl")

    # Try to find path of latest model
    # Handle possible `IndexError`
    try:
        model_path = sorted(glob(pattern))[-1]
    except IndexError:
        raise Exception(f"No model trained for {ticker}")

    # Load model
    model = joblib.load(model_path)

    # Return model
    return model
[28]:
# Assign load output to `model`
model_shop = load(ticker="SHOPERSTOP.BSE")
​
# Does function return an `ARCHModelResult`
assert isinstance(model_shop, ARCHModelResult)
​
# Check model parameters
model_shop.summary()
[28]:
Constant Mean - GARCH Model Results
Dep. Variable: return R-squared: 0.000
Mean Model: Constant Mean Adj. R-squared: 0.000
Vol Model: GARCH Log-Likelihood: -2428.86
Distribution: Normal AIC: 4865.73
Method: Maximum Likelihood BIC: 4885.36
No. Observations: 1000
Date: Thu, Jan 26 2023 Df Residuals: 999
Time: 12:12:19 Df Model: 1
Mean Model
coef std err t P>|t| 95.0% Conf. Int.
mu 0.1028 7.712e-02 1.333 0.183 [-4.836e-02, 0.254]
Volatility Model
coef std err t P>|t| 95.0% Conf. Int.
omega 0.1464 0.188 0.781 0.435 [ -0.221, 0.514]
alpha[1] 0.0372 2.485e-02 1.497 0.134 [-1.151e-02,8.590e-02]
beta[1] 0.9468 4.437e-02 21.340 4.838e-101 [ 0.860, 1.034]


Covariance estimator: robust
[29]:
VimeoVideo("772219392", h="deed99bf85", width=600)

Task 8.4.8: Transform your load function into a method for your GarchModel class. When you're done, test the method using the assert statements below.

  • Write a class method in Python.
  • What's an assert statement?
  • Write an assert statement in Python.
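For reference, a minimal sketch of the method. As before, the class below is a stand-in with only the pieces load needs, and a plain "models" string replaces the configured model directory:

```python
import os
from glob import glob

import joblib


class GarchModel:
    """Stand-in with only the pieces `load` needs; the real class lives in model.py."""

    def __init__(self, ticker, model_directory="models"):
        self.ticker = ticker
        self.model_directory = model_directory  # real app: settings.model_directory

    def load(self):
        """Attach the most recent saved model for `self.ticker` as `self.model`."""
        # Pattern matching any pickled model for this ticker
        pattern = os.path.join(self.model_directory, f"*{self.ticker}.pkl")
        try:
            # Timestamped filenames sort chronologically, so the last one is newest
            model_path = sorted(glob(pattern))[-1]
        except IndexError:
            raise Exception(f"No model trained for '{self.ticker}'.")
        self.model = joblib.load(model_path)
```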
[31]:
model_shop = GarchModel(ticker="SHOPERSTOP.BSE", repo=repo, use_new_data=False)
​
# Check that new `model_shop_test` doesn't have model attached
assert not hasattr(model_shop, "model")
​
# Load model
model_shop.load()
​
# Does `model_shop_test` have model attached?
assert hasattr(model_shop, "model")
​
model_shop.model.summary()
[31]:
Constant Mean - GARCH Model Results
Dep. Variable: return R-squared: 0.000
Mean Model: Constant Mean Adj. R-squared: 0.000
Vol Model: GARCH Log-Likelihood: -2428.86
Distribution: Normal AIC: 4865.73
Method: Maximum Likelihood BIC: 4885.36
No. Observations: 1000
Date: Thu, Jan 26 2023 Df Residuals: 999
Time: 12:12:19 Df Model: 1
Mean Model
coef std err t P>|t| 95.0% Conf. Int.
mu 0.1028 7.712e-02 1.333 0.183 [-4.836e-02, 0.254]
Volatility Model
coef std err t P>|t| 95.0% Conf. Int.
omega 0.1464 0.188 0.781 0.435 [ -0.221, 0.514]
alpha[1] 0.0372 2.485e-02 1.497 0.134 [-1.151e-02,8.590e-02]
beta[1] 0.9468 4.437e-02 21.340 4.838e-101 [ 0.860, 1.034]


Covariance estimator: robust

Our model module is done! Now it's time to move on to the "main" course and add the final piece to our application.

Main Module

Similar to the interactive applications we made in Projects 6 and 7, our first step here will be to create an app object. This time, instead of being a plotly application, it'll be a FastAPI application.

[32]:
VimeoVideo("772219283", h="2cd1d97516", width=600)

Task 8.4.9: In the main module, instantiate a FastAPI application named app.

  • Instantiate an application in FastAPI.

In order for our app to work, we need to run it on a server. In this case, we'll run the server on our virtual machine using the uvicorn library.

[33]:
VimeoVideo("772219237", h="5ee74f82db", width=600)

Task 8.4.10: Go to the command line, navigate to the directory for this project, and start your app server by entering the following command.

uvicorn main:app --reload --workers 1 --host localhost --port 8008

Remember how the AlphaVantage API had a "/query" path that we accessed using a get HTTP request? We're going to build similar paths for our application. Let's start with an MVP example so we can learn how paths work in FastAPI.

[34]:
VimeoVideo("772219175", h="6f53c61020", width=600)

Task 8.4.11: Create a "/hello" path for your app that returns a greeting when it receives a get request.

  • Create an application path in FastAPI.

We've got our path. Let's perform a get request to see if it works.

[35]:
VimeoVideo("772219134", h="09a4b98413", width=600)

Task 8.4.12: Create a get request to hit the "/hello" path running at "http://localhost:8008".

  • What's an HTTP request?
  • Make an HTTP request using requests.
[36]:
url = "http://localhost:8008/hello"
response = requests.get(url)
​
print("response code:", response.status_code)
response.json()
response code: 200
[36]:
{'message': 'hello world!'}

Excellent! Now let's start building the fun stuff.

"/fit" Path

Our first path will allow the user to fit a model to stock data when they make a post request to our server. They'll have the choice to use new data from AlphaVantage, or older data that's already in our database. When a user makes a request, they'll receive a response telling them if the operation was successful or whether there was an error.

One thing that's very important when building an API is making sure the user passes the correct parameters into the app. Otherwise, our app could crash! FastAPI works well with the pydantic library, which checks that each request has the correct parameters and data types. It does this by using special data classes that we need to define. Our "/fit" path will take user input and then output a response, so we need two classes: one for input and one for output.

[37]:
VimeoVideo("772219078", h="4f016b11e1", width=600)

Task 8.4.13: Create definitions for a FitIn and a FitOut data class. The FitIn class should inherit from the pydantic BaseModel, and the FitOut class should inherit from the FitIn class. Be sure to include type hints.

  • Write a class definition in Python.
  • What's class inheritance?
  • What are type hints?
  • Define a data model in pydantic.
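Based on the parameters this notebook passes to the "/fit" path, the two classes could be sketched like this (the field names must match the JSON keys users will send):

```python
from pydantic import BaseModel


class FitIn(BaseModel):
    """Parameters a user must supply when requesting a model fit."""

    ticker: str
    use_new_data: bool
    n_observations: int
    p: int
    q: int


class FitOut(FitIn):
    """Everything in `FitIn`, plus fields describing how the fit went."""

    success: bool
    message: str
```

Because FitOut inherits from FitIn, the response automatically echoes back every request parameter alongside the success flag and message.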

With our data classes defined, let's see how pydantic ensures that users supply the correct input and our application returns the correct output.

[38]:
VimeoVideo("772219008", h="ad1114eb9e", width=600)

Task 8.4.14: Use the code below to experiment with your FitIn and FitOut classes. Under what circumstances does instantiating them throw errors? What class or classes are they instances of?

  • What's class inheritance?
  • What are type hints?
  • Define a data model in pydantic.
[43]:
from main import FitIn, FitOut
​
# Instantiate `FitIn`. Play with parameters.
fi = FitIn(
    ticker="SHOPERSTOP.BSE",
    use_new_data=True,
    n_observations=2000,
    p=1,
    q=1
)
    
print(fi)
​
# Instantiate `FitOut`. Play with parameters.
fo = FitOut(
    ticker="SHOPERSTOP.BSE",
    use_new_data=True,
    n_observations=2000,
    p=1,
    q=1,
    success=True,
    message="Model is ready"
)
print(fo)
ticker='SHOPERSTOP.BSE' p=1 q=1 n_observations=2000 use_new_data=True
ticker='SHOPERSTOP.BSE' p=1 q=1 n_observations=2000 use_new_data=True success=True message='Model is ready'

One cool feature of FastAPI is that it can work in asynchronous scenarios. That's not something we need to learn for this project, but it does mean we need to instantiate a GarchModel object each time a user makes a request. To keep our code tidy, let's write a function to handle that process.

[44]:
VimeoVideo("772218958", h="37744c9d88", width=600)

Task 8.4.15: Create a build_model function in your main module. Use the docstring as a guide, and test your function below.

  • What's a function?
  • Write a function in Python.
  • What's an assert statement?
  • Write an assert statement in Python.
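A sketch of the factory function. The SQLRepository and GarchModel classes here are throwaway stand-ins for the real ones in the data and model modules, and an in-memory database replaces the configured database name so the sketch runs on its own:

```python
import sqlite3


class SQLRepository:  # stand-in for the class in data.py
    def __init__(self, connection):
        self.connection = connection


class GarchModel:  # stand-in for the class in model.py
    def __init__(self, ticker, repo, use_new_data):
        self.ticker = ticker
        self.repo = repo
        self.use_new_data = use_new_data


def build_model(ticker, use_new_data):
    """Assemble a ready-to-use `GarchModel` for a single request."""
    # Each request gets a fresh connection, repository, and model
    connection = sqlite3.connect(":memory:", check_same_thread=False)  # real app: settings.db_name
    repo = SQLRepository(connection=connection)
    return GarchModel(ticker=ticker, repo=repo, use_new_data=use_new_data)
```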
[46]:
from main import build_model
​
# Instantiate `GarchModel` with function
model_shop = build_model(ticker="SHOPERSTOP.BSE", use_new_data=False)
​
# Is `SQLRepository` attached to `model_shop`?
assert isinstance(model_shop.repo, SQLRepository)
​
# Is SQLite database attached to `SQLRepository`
assert isinstance(model_shop.repo.connection, sqlite3.Connection)
​
# Is `ticker` attribute correct?
assert model_shop.ticker == "SHOPERSTOP.BSE"
​
# Is `use_new_data` attribute correct?
assert not model_shop.use_new_data
​
model_shop
[46]:
<model.GarchModel at 0x7f69a0e12af0>

We've got data classes, we've got a build_model function, and all that's left is to build the "/fit" path. We'll use our "/hello" path as a template, but we'll need to include more features, like error handling.

[47]:
VimeoVideo("772218892", h="6779ee3470", width=600)

Task 8.4.16: Create a "/fit" path for your app. It will take a FitIn object as input, and then build a GarchModel using the build_model function. The model will wrangle the needed data, fit to the data, and save the completed model. Finally, it will send a response in the form of a FitOut object. Be sure to handle any errors that may arise.

  • Create an application path in FastAPI.

Last step! Let's make a post request and see how our app responds.

[48]:
VimeoVideo("772218833", h="6d27fb4539", width=600)

Task 8.4.17: Create a post request to hit the "/fit" path running at "http://localhost:8008". You should train a GARCH(1,1) model on 2000 observations of the Shoppers Stop data you already downloaded. Pass in your parameters as a dictionary using the json argument.

  • What's an argument?
  • What's an HTTP request?
  • Make an HTTP request using requests.
[49]:
# URL of `/fit` path
url = "http://localhost:8008/fit"
​
# Data to send to path
json = {
    "ticker":"SHOPERSTOP.BSE",
    "use_new_data":False,
    "n_observations":2000,
    "p":1,
    "q":1
}
# Response of post request
response = requests.post(url=url, json=json)
# Inspect response
print("response code:", response.status_code)
response.json()
response code: 200
[49]:
{'ticker': 'SHOPERSTOP.BSE',
 'p': 1,
 'q': 1,
 'n_observations': 2000,
 'use_new_data': False,
 'success': True,
 'message': 'Trained and saved models/2023-01-26T13:21:28.306362_SHOPERSTOP.BSE.pkl'}

Boom! Now we can train models using the API we created. Up next: a path for making predictions.

"/predict" Path

For our "/predict" path, users will be able to make a post request with the ticker symbol they want a prediction for and the number of days they want to forecast into the future. Our app will return a forecast or, if there's an error, a message explaining the problem.

The setup will be very similar to our "/fit" path. We'll start with data classes for the in- and output.

[50]:
VimeoVideo("772218808", h="3a73624069", width=600)

Task 8.4.18: Create definitions for a PredictIn and PredictOut data class. The PredictIn class should inherit from the pydantic BaseModel, and the PredictOut class should inherit from the PredictIn class. Be sure to include type hints. Then use the code below to test your classes.

  • Write a class definition in Python.
  • What's class inheritance?
  • What are type hints?
  • Define a data model in pydantic.
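Consistent with the test cell below, the classes could be sketched like this — note that PredictOut carries the forecast dictionary in addition to the success flag and message:

```python
from pydantic import BaseModel


class PredictIn(BaseModel):
    """What a user supplies when requesting a forecast."""

    ticker: str
    n_days: int


class PredictOut(PredictIn):
    """The request parameters plus the forecast itself."""

    success: bool
    forecast: dict
    message: str
```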
[57]:
from main import PredictIn, PredictOut
​
pi = PredictIn(ticker="SHOPERSTOP.BSE", n_days=5)
print(pi)
​
po = PredictOut(
    ticker="SHOPERSTOP.BSE", n_days=5, success=True, forecast={}, message="success"
)
print(po)
ticker='SHOPERSTOP.BSE' n_days=5
ticker='SHOPERSTOP.BSE' n_days=5 success=True forecast={} message='success'

Up next, let's create the path. The good news is that we'll be able to reuse our build_model function.

[51]:
VimeoVideo("772218740", h="ff06859ece", width=600)

Task 8.4.19: Create a "/predict" path for your app. It will take a PredictIn object as input, build a GarchModel, load the most recent trained model for the given ticker, and generate a dictionary of predictions. It will then return a PredictOut object with the predictions included. Be sure to handle any errors that may arise.

  • Create an application path in FastAPI.

Last step, let's see what happens when we make a post request...

[59]:
VimeoVideo("772218642", h="1da744b9e7", width=600)

Task 8.4.20: Create a post request to hit the "/predict" path running at "http://localhost:8008". You should get the 5-day volatility forecast for Shoppers Stop. When you're satisfied, submit your work to the grader.

  • What's an HTTP request?
  • Make an HTTP request using requests.
[68]:
# URL of `/predict` path
url = "http://localhost:8008/predict"
# Data to send to path
json = {
    "ticker":"SHOPERSTOP.BSE",
    "n_days":5,
}
# Response of post request
response = requests.post(url=url, json=json)
# Response JSON to be submitted to grader
submission = response.json()
# Inspect JSON
submission
[68]:
{'ticker': 'SHOPERSTOP.BSE',
 'n_days': 5,
 'success': True,
 'forecast': {'2023-01-26T00:00:00': 1.9746357858800914,
  '2023-01-27T00:00:00': 1.988421386318225,
  '2023-01-30T00:00:00': 2.0018867616082825,
  '2023-01-31T00:00:00': 2.015042027695701,
  '2023-02-01T00:00:00': 2.027896831972314},
 'message': ' '}
[67]:
wqet_grader.grade("Project 8 Assessment", "Task 8.4.20", submission)

You're making this look easy. 😉

Score: 1


We did it! Better said, you did it. You got data from the AlphaVantage API, you stored it in a SQL database, you built and trained a GARCH model to predict volatility, and you created your own API to serve predictions from your model. That's data engineering, data science, and model deployment all in one project. If you haven't already, now's a good time to give yourself a pat on the back. You definitely deserve it.


Copyright 2022 WorldQuant University. This content is licensed solely for personal use. Redistribution or publication of this material is strictly prohibited.

8.2. Test Driven Development

In the previous lesson, we learned how to get data from an API. In this lesson, we have two goals. First, we'll take the code we used to access the API and build an AlphaVantageAPI class. This will allow us to reuse our code. Second, we'll create a SQLRepository class that will help us load our stock data into a SQLite database and then extract it for later use. Additionally, we'll build this code using a technique called test driven development, where we'll use assert statements to make sure everything is working properly. That way, we'll avoid issues later when we build our application.

[37]:
%load_ext autoreload
%load_ext sql
%autoreload 2
​
import sqlite3
​
import matplotlib.pyplot as plt
import pandas as pd
import wqet_grader
from config import settings
from IPython.display import VimeoVideo
​
wqet_grader.init("Project 8 Assessment")
The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
The sql extension is already loaded. To reload it, use:
  %reload_ext sql
[2]:
VimeoVideo("764766424", h="88dbe3bff8", width=600)
Building Our Data Module

For our application, we're going to keep all the classes we use to extract, transform, and load data in a single module that we'll call data.

AlphaVantage API Class

Let's get started by taking the code we created in the last lesson and incorporating it into a class that will be in charge of getting data from the AlphaVantage API.

[3]:
VimeoVideo("764766399", h="08b6a61e84", width=600)
Task 8.2.1: In the data module, create a class definition for AlphaVantageAPI. For now, make sure that it has an __init__ method that attaches your API key as the attribute __api_key. Once you're done, import the class below and create an instance of it called av.

  • What's a class?
  • Write a class definition in Python.
  • Write a class method in Python.
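A minimal sketch of the starting point. In the real module the key would come from your configuration settings rather than the placeholder default used here:

```python
class AlphaVantageAPI:
    """Client in charge of all communication with the AlphaVantage API."""

    def __init__(self, api_key="demo-key"):  # real app reads the key from `config.settings`
        # The double leading underscore name-mangles the attribute,
        # discouraging access to the key from outside the class
        self.__api_key = api_key
```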
[7]:
# Import `AlphaVantageAPI`
from data import AlphaVantageAPI
# Create instance of `AlphaVantageAPI` class
av = AlphaVantageAPI()
​
print("av type:", type(av))
av type: <class 'data.AlphaVantageAPI'>
Remember the get_daily function we made in the last lesson? Now we're going to turn it into a class method.

[8]:
VimeoVideo("764766380", h="5b4cf7c753", width=600)
Task 8.2.2: Create a get_daily method for your AlphaVantageAPI class. Once you're done, use the cell below to fetch the stock data for the renewable energy company Suzlon and assign it to the DataFrame df_suzlon.

  • Write a class method in Python.
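Running the real method requires a live API key, but the heart of it is transforming AlphaVantage's JSON into a DataFrame that passes the tests below. Here's a sketch of that transformation, using a tiny hand-written stand-in for the payload AlphaVantage returns under its "Time Series (Daily)" key:

```python
import pandas as pd

# Two days of fake data in AlphaVantage's daily-series shape
sample = {
    "2023-01-25": {"1. open": "9.66", "2. high": "9.75", "3. low": "9.41",
                   "4. close": "9.49", "5. volume": "23989499"},
    "2023-01-24": {"1. open": "9.80", "2. high": "9.84", "3. low": "9.61",
                   "4. close": "9.65", "5. volume": "17695141"},
}


def to_dataframe(time_series):
    """Convert AlphaVantage's dict-of-dicts into a tidy numeric DataFrame."""
    # Dates become the index; all string values are coerced to float
    df = pd.DataFrame.from_dict(time_series, orient="index", dtype=float)
    df.index = pd.to_datetime(df.index)
    df.index.name = "date"
    # Drop the "1. ", "2. ", ... prefixes from the column names
    df.columns = [c.split(". ")[1] for c in df.columns]
    return df


df = to_dataframe(sample)
```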
[9]:
# Define Suzlon ticker symbol
ticker = "SUZLON.BSE"
​
# Use your `av` object to get daily data
df_suzlon = av.get_daily(ticker)
​
print("df_suzlon type:", type(df_suzlon))
print("df_suzlon shape:", df_suzlon.shape)
df_suzlon.head()
df_suzlon type: <class 'pandas.core.frame.DataFrame'>
df_suzlon shape: (4253, 5)
[9]:
open high low close volume
date
2023-01-25 9.66 9.75 9.41 9.49 23989499.0
2023-01-24 9.80 9.84 9.61 9.65 17695141.0
2023-01-23 10.12 10.13 9.70 9.76 21734343.0
2023-01-20 9.65 10.24 9.65 10.03 46167723.0
2023-01-19 9.70 9.79 9.63 9.68 24547124.0
Okay! The next thing we need to do is test our new method to make sure it works the way we want it to. Usually, these sorts of tests are written before writing the method, but, in this first case, we'll do it the other way around in order to get a better sense of how assert statements work.

[10]:
VimeoVideo("764766326", h="3ffc1a1a2f", width=600)
Task 8.2.3: Create four assert statements to test the output of your get_daily method. Use the comments below as a guide.

  • What's an assert statement?
  • Write an assert statement in Python.
[11]:
# Does `get_daily` return a DataFrame?
assert isinstance(df_suzlon, pd.DataFrame)
# Does DataFrame have 5 columns?
assert df_suzlon.shape[1] == 5
# Does DataFrame have a DatetimeIndex?
assert isinstance(df_suzlon.index, pd.DatetimeIndex)
# Is the index name "date"?
assert df_suzlon.index.name == "date"
[12]:
VimeoVideo("764766298", h="282ced7752", width=600)
Task 8.2.4: Create two more tests for the output of your get_daily method. Use the comments below as a guide.

  • What's an assert statement?
  • Write an assert statement in Python.
[20]:
# Does DataFrame have correct column names?
assert df_suzlon.columns.to_list() == ['open', 'high', 'low', 'close', 'volume']
​
# Are columns correct data type?
assert all(df_suzlon.dtypes == float)
Okay! Now that our AlphaVantageAPI is ready to get data, let's turn our focus to the class we'll need for storing our data in our SQLite database.

SQL Repository Class

It wouldn't be efficient if our application needed to get data from the AlphaVantage API every time we wanted to explore our data or build a model, so we'll need to store our data in a database. Because our data is highly structured (each DataFrame we extract from AlphaVantage is always going to have the same five columns), it makes sense to use a SQL database.

We'll use SQLite for our database. For consistency, this database will always have the same name, which we've stored in our .env file.

[21]:
VimeoVideo("764766285", h="7b6487a28d", width=600)
Task 8.2.5: Connect to the database whose name is stored in the .env file for this project. Be sure to set the check_same_thread argument to False. Assign the connection to the variable connection.

  • Open a connection to a SQL database using sqlite3.
[22]:
connection = sqlite3.connect(database=settings.db_name, check_same_thread=False)
​
print("connection type:", type(connection))
connection type: <class 'sqlite3.Connection'>
We've got a connection, and now we need to start building the class that will handle all our transactions with the database. With this class, though, we're going to create our tests before writing the class definition.

[23]:
VimeoVideo("764766249", h="4359c98af4", width=600)
Task 8.2.6: Write two tests for the SQLRepository class, using the comments below as a guide.

  • What's an assert statement?
  • Write an assert statement in Python.
[26]:
# Import class definition
from data import SQLRepository
​
# Create instance of class
repo = SQLRepository(connection=connection)
​
# Does `repo` have a "connection" attribute?
assert hasattr(repo, "connection")
​
# Is the "connection" attribute a SQLite `Connection`?
assert isinstance(repo.connection, sqlite3.Connection)

Tip: You won't be able to run this ☝️ code block until you complete the task below. 👇

[25]:
VimeoVideo("764766224", h="71655b61c2", width=600)
Task 8.2.7: Create a definition for your SQLRepository class. For now, just complete the __init__ method. Once you're done, use the code you wrote in the previous task to test it.

  • What's a class?
  • Write a class definition in Python.
  • Write a class method in Python.
The next method we need for the SQLRepository class is one that allows us to store information. In SQL talk, this is generally referred to as inserting tables into the database.

[27]:
VimeoVideo("764766175", h="6d2f030425", width=600)
Task 8.2.8: Add an insert_table method to your SQLRepository class. Use the assert statements below and the docstring in the data module as a guide. When you're done, run the cell below to check your work.

  • Write a class method in Python.
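Since pandas DataFrames already know how to write themselves to SQL, insert_table can be a thin wrapper around DataFrame.to_sql. A sketch, with the __init__ from the previous task repeated so the block stands on its own:

```python
import sqlite3

import pandas as pd


class SQLRepository:
    def __init__(self, connection):
        self.connection = connection

    def insert_table(self, table_name, records, if_exists="fail"):
        """Insert DataFrame into database as a table; return a transaction record."""
        n_inserted = len(records)
        # `to_sql` creates, appends to, or replaces the table, per `if_exists`
        records.to_sql(name=table_name, con=self.connection, if_exists=if_exists)
        return {
            "transaction_successful": True,
            "records_inserted": n_inserted,
        }
```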
[28]:
response = repo.insert_table(table_name=ticker, records=df_suzlon, if_exists="replace")
​
# Does your method return a dictionary?
assert isinstance(response, dict)
​
# Are the keys of that dictionary correct?
assert sorted(list(response.keys())) == ["records_inserted", "transaction_successful"]
If our method is passing the assert statements, we know it's returning a record of the database transaction, but we still need to check whether the data has actually been added to the database.

[29]:
VimeoVideo("764766150", h="80fc271c75", width=600)
Task 8.2.9: Write a SQL query to get the first five rows of the table of Suzlon data you just inserted into the database.

  • Write a basic query in SQL.
[30]:
%load_ext sql
%sql sqlite:////home/jovyan/work/ds-curriculum/080-volatility-forecasting-in-india/stocks.sqlite
The sql extension is already loaded. To reload it, use:
  %reload_ext sql
[30]:
'Connected: @/home/jovyan/work/ds-curriculum/080-volatility-forecasting-in-india/stocks.sqlite'
[32]:
%%sql
select *
from 'SUZLON.BSE'
limit 5
​
 * sqlite:////home/jovyan/work/ds-curriculum/080-volatility-forecasting-in-india/stocks.sqlite
Done.
[32]:
date open high low close volume
2023-01-25 00:00:00 9.66 9.75 9.41 9.49 23989499.0
2023-01-24 00:00:00 9.8 9.84 9.61 9.65 17695141.0
2023-01-23 00:00:00 10.12 10.13 9.7 9.76 21734343.0
2023-01-20 00:00:00 9.65 10.24 9.65 10.03 46167723.0
2023-01-19 00:00:00 9.7 9.79 9.63 9.68 24547124.0
We can insert data into our database, but let's not forget that we need to read data from it, too. Reading will be a little more complex than inserting, so let's start by writing code in this notebook before we incorporate it into our SQLRepository class.

[33]:
VimeoVideo("764766109", h="d04a7a3f9f", width=600)
Task 8.2.10: First, write a SQL query to get all the Suzlon data. Then use pandas to extract the data from the database and read it into a DataFrame named df_suzlon_test.

  • Write a basic query in SQL.
  • Read SQL query into a DataFrame using pandas.
[39]:
sql = "select * from 'SUZLON.BSE'"
df_suzlon_test = pd.read_sql(sql, con=connection, parse_dates=["date"], index_col="date")
​
print("df_suzlon_test type:", type(df_suzlon_test))
print()
print(df_suzlon_test.info())
df_suzlon_test.head()
df_suzlon_test type: <class 'pandas.core.frame.DataFrame'>

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 4253 entries, 2023-01-25 to 2005-10-20
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   open    4253 non-null   float64
 1   high    4253 non-null   float64
 2   low     4253 non-null   float64
 3   close   4253 non-null   float64
 4   volume  4253 non-null   float64
dtypes: float64(5)
memory usage: 199.4 KB
None
[39]:
open high low close volume
date
2023-01-25 9.66 9.75 9.41 9.49 23989499.0
2023-01-24 9.80 9.84 9.61 9.65 17695141.0
2023-01-23 10.12 10.13 9.70 9.76 21734343.0
2023-01-20 9.65 10.24 9.65 10.03 46167723.0
2023-01-19 9.70 9.79 9.63 9.68 24547124.0
Now that we know how to read a table from our database, let's turn our code into a proper function. But since we're doing backward design, we need to start with our tests.

[38]:
VimeoVideo("764772699", h="6d97cff2e8", width=600)
Task 8.2.11: Complete the assert statements below to test your read_table function. Use the comments as a guide.

  • What's an assert statement?
  • Write an assert statement in Python.
[55]:
# Assign `read_table` output to `df_suzlon`
df_suzlon = repo.read_table(table_name="SUZLON.BSE", limit=2500)  # noQA F821
​
# Is `df_suzlon` a DataFrame?
assert isinstance(df_suzlon, pd.DataFrame)
​
# Does it have a `DatetimeIndex`?
assert isinstance(df_suzlon.index, pd.DatetimeIndex)
​
# Is the index named "date"?
assert df_suzlon.index.name == "date"
​
# Does it have 2,500 rows and 5 columns?
assert df_suzlon.shape == (2500, 5)
# Are the column names correct?
assert df_suzlon.columns.to_list() == ['open', 'high', 'low', 'close', 'volume']
​
# Are the column data types correct?
assert all(df_suzlon.dtypes == float)
​
# Print `df_suzlon` info
print("df_suzlon shape:", df_suzlon.shape)
print()
print(df_suzlon.info())
df_suzlon.head()
df_suzlon shape: (2500, 5)

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2500 entries, 2023-01-25 to 2012-12-04
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   open    2500 non-null   float64
 1   high    2500 non-null   float64
 2   low     2500 non-null   float64
 3   close   2500 non-null   float64
 4   volume  2500 non-null   float64
dtypes: float64(5)
memory usage: 117.2 KB
None
[55]:
open high low close volume
date
2023-01-25 9.66 9.75 9.41 9.49 23989499.0
2023-01-24 9.80 9.84 9.61 9.65 17695141.0
2023-01-23 10.12 10.13 9.70 9.76 21734343.0
2023-01-20 9.65 10.24 9.65 10.03 46167723.0
2023-01-19 9.70 9.79 9.63 9.68 24547124.0
<div class="alert alert-info" role="alert">

Tip: You won't be able to run this ☝️ code block until you complete the task below. 👇

</div>

[41]:
VimeoVideo("764772667", h="afbd47543a", width=600)
[41]:
**Task 8.2.12:** Expand on the code you've written above to complete the `read_table` function below. Use the docstring as a guide.

  • What's a function?
  • Write a function in Python.
  • Write a basic query in SQL.
<div class="alert alert-info" role="alert">

Tip: Remember that we stored our data sorted descending by date. It'll definitely make our `read_table` function easier to implement!

</div>

[50]:
def read_table(table_name, limit=None):
​
    """Read table from database.
​
    Parameters
    ----------
    table_name : str
        Name of table in SQLite database.
    limit : int, None, optional
        Number of most recent records to retrieve. If `None`, all
        records are retrieved. By default, `None`.
​
    Returns
    -------
    pd.DataFrame
        Index is DatetimeIndex "date". Columns are 'open', 'high',
        'low', 'close', and 'volume'. All columns are numeric.
    """
    # Create SQL query (with optional limit)
    if limit:
        sql = f"select * from '{table_name}' limit {limit}"
    else:
        sql = f"select * from '{table_name}'"
​
    # Retrieve data, read into DataFrame
    df = pd.read_sql(sql, con=connection, parse_dates=["date"], index_col="date")
​
    # Return DataFrame
    return df
[52]:
VimeoVideo("764772652", h="9f89b8c66e", width=600)
[52]:
**Task 8.2.13:** Turn the `read_table` function into a method for your `SQLRepository` class.

  • Write a class method in Python.
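As a sketch of what that refactor could look like, here's a stripped-down `SQLRepository` with `read_table` as a method. This is illustrative only: your class in `data.py` also has an `insert_table` method, and its details may differ.

```python
import sqlite3

import pandas as pd


class SQLRepository:
    """Stripped-down sketch of the repository class."""

    def __init__(self, connection):
        self.connection = connection

    def read_table(self, table_name, limit=None):
        """Read a table into a DataFrame, newest `limit` rows first."""
        # Same query logic as the notebook function, but the connection
        # now comes from the instance attribute instead of a global
        if limit:
            sql = f"SELECT * FROM '{table_name}' LIMIT {limit}"
        else:
            sql = f"SELECT * FROM '{table_name}'"
        return pd.read_sql(
            sql, con=self.connection, parse_dates=["date"], index_col="date"
        )
```

Because we stored the rows sorted descending by date, `LIMIT n` returns the `n` most recent records without any extra sorting logic.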
[53]:
VimeoVideo("764772632", h="3e374abcc3", width=600)
[53]:
**Task 8.2.14:** Return to <a href="#task-8211">Task 8.2.11</a> and change the code so that you're testing your class method instead of your notebook function.

  • What's an assert statement?
  • Write an assert statement in Python.
Excellent! We have everything we need to get data from AlphaVantage, save that data in our database, and access it later on. Now it's time to do a little exploratory analysis to compare the stocks of the two companies we have data for.

# Comparing Stock Returns

We already have the data for Suzlon Energy in our database, but we need to add the data for Ambuja Cement before we can compare the two stocks.

[56]:
VimeoVideo("764772620", h="d635a99b74", width=600)
[56]:
**Task 8.2.15:** Use the instances of the `AlphaVantageAPI` and `SQLRepository` classes you created in this lesson (`av` and `repo`, respectively) to get the stock data for Ambuja Cement and insert it into the database.

  • Write a basic query in SQL.
  • Read SQL query into a DataFrame using pandas.
[60]:
ticker = "AMBUJACEM.BSE"
​
# Get Ambuja data using `av`
ambuja_records = av.get_daily(ticker)
​
# Insert `ambuja_records` into the database using `repo`
response = repo.insert_table(
    table_name=ticker, records=ambuja_records, if_exists="replace"
)
​
response
[60]:
{'transaction_successful': True, 'records_inserted': 4452}
Let's take a look at the data to make sure we're getting what we need.

[61]:
VimeoVideo("764772601", h="f0be0fbb1a", width=600)
[61]:
**Task 8.2.16:** Using the `read_table` method you've added to your `SQLRepository`, extract the most recent 2,500 rows of data for Ambuja Cement from the database and assign the result to `df_ambuja`.

  • Write a basic query in SQL.
  • Read SQL query into a DataFrame using pandas.
[62]:
ticker = "AMBUJACEM.BSE"
df_ambuja = repo.read_table(table_name=ticker, limit=2500)
​
print("df_ambuja type:", type(df_ambuja))
print("df_ambuja shape:", df_ambuja.shape)
df_ambuja.head()
df_ambuja type: <class 'pandas.core.frame.DataFrame'>
df_ambuja shape: (2500, 5)
[62]:
open high low close volume
date
2023-01-24 501.20 508.55 497.55 498.55 100346.0
2023-01-23 517.40 518.45 498.55 500.90 126483.0
2023-01-20 519.05 522.70 515.30 517.20 55838.0
2023-01-19 518.50 525.40 517.30 519.00 82121.0
2023-01-18 516.65 522.00 513.00 519.90 82300.0
We've spent a lot of time so far looking at this data, but what does it actually represent? It turns out the stock market is a lot like any other market: people buy and sell goods. The prices of those goods can go up or down depending on factors like supply and demand. In the case of a stock market, the goods being sold are stocks (also called equities or securities), which represent an ownership stake in a corporation.

During each trading day, the price of a stock will change, so when we're looking at whether a stock might be a good investment, we look at five numbers: open, high, low, close, and volume. Open is exactly what it sounds like: the selling price of a share when the market opens for the day. Similarly, close is the selling price of a share when the market closes at the end of the day, and high and low are the respective maximum and minimum prices of a share over the course of the day. Volume is the number of shares of a given stock that have been bought and sold that day. Generally speaking, a firm whose shares have seen a high volume of trading will see more price variation over the course of the day than a firm whose shares have been more lightly traded.

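To make these five numbers concrete, here's the 2023-01-20 row from the Suzlon table above as a one-row DataFrame, along with two quantities derived from it. The `range` and `net_change` columns are our own illustrative additions, not part of the lesson's schema.

```python
import pandas as pd

# The 2023-01-20 row from the Suzlon table above
day = pd.DataFrame(
    {
        "open": [9.65],
        "high": [10.24],
        "low": [9.65],
        "close": [10.03],
        "volume": [46167723.0],
    },
    index=pd.to_datetime(["2023-01-20"]),
)

# High minus low: the full price range the stock traded in that day
day["range"] = day["high"] - day["low"]

# Close minus open: the net change over the trading session
day["net_change"] = day["close"] - day["open"]
```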
Let's visualize how the price of Ambuja Cement has changed over the last decade.

[63]:
VimeoVideo("764772582", h="c2b9c56782", width=600)
[63]:
**Task 8.2.17:** Plot the closing price of `df_ambuja`. Be sure to label your axes and include a legend.

  • Make a line plot with time series data in pandas.
[65]:
fig, ax = plt.subplots(figsize=(15, 6))
# Plot `df_ambuja` closing price
df_ambuja["close"].plot(ax=ax, label='AMBUJACEM', color="C1")
​
# Label axes
plt.xlabel("Date")
plt.ylabel("Closing Price")
​
​
# Add legend
plt.legend()
[65]:
<matplotlib.legend.Legend at 0x7fe3c1ef31c0>
Let's add the closing price of Suzlon to our graph so we can compare the two.

[66]:
VimeoVideo("764772560", h="cabe95603f", width=600)
[66]:
**Task 8.2.18:** Create a plot that shows the closing prices of `df_suzlon` and `df_ambuja`. Again, label your axes and include a legend.

  • Make a line plot with time series data in pandas.
[69]:
fig, ax = plt.subplots(figsize=(15, 6))
# Plot `df_suzlon` and `df_ambuja`
df_ambuja["close"].plot(ax=ax, label='AMBUJACEM', color="C1")
df_suzlon["close"].plot(ax=ax, label='SUZLON')
# Label axes
plt.xlabel("Date")
plt.ylabel("Closing Price")
​
​
# Add legend
plt.legend()
​
[69]:
<matplotlib.legend.Legend at 0x7fe3c1f5f3a0>
Looking at this plot, we might conclude that Ambuja Cement is a "better" stock than Suzlon Energy because its price is higher. But price is just one factor that an investor must consider when creating an investment strategy. What is definitely true is that it's hard to do a head-to-head comparison of these two stocks because there's such a large price difference.

One way in which investors compare stocks is by looking at their returns instead. A return is the change in value in an investment, represented as a percentage. So let's look at the daily returns for our two stocks.

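As a quick sketch with made-up prices, this is what a daily return calculation with `pct_change` does:

```python
import pandas as pd

# Three hypothetical consecutive closing prices
close = pd.Series([100.0, 105.0, 84.0])

# Each value is compared to the one before it; multiplying by 100
# expresses the change as a percentage, matching the lesson's convention
returns = close.pct_change() * 100

# The first value is NaN: there's no previous day to compare against
print(returns)
```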
[70]:
VimeoVideo("764772521", h="48fb7816c9", width=600)
[70]:
**Task 8.2.19:** Add a `"return"` column to `df_ambuja` that shows the percentage change in the `"close"` column from one day to the next.

  • Calculate the percentage change of a column using pandas.
  • Create new columns derived from existing columns in a DataFrame using pandas.
<div class="alert alert-info" role="alert">

Tip: Our two DataFrames are sorted descending by date, but you'll need to make sure they're sorted ascending in order to calculate their returns.

</div>

[73]:
# Sort DataFrame ascending by date
df_ambuja.sort_index(inplace=True)
​
# Create "return" column
df_ambuja["return"] = df_ambuja["close"].pct_change() * 100
​
​
print("df_ambuja shape:", df_ambuja.shape)
print(df_ambuja.info())
df_ambuja.head()
df_ambuja shape: (2500, 6)
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2500 entries, 2012-12-03 to 2023-01-24
Data columns (total 6 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   open    2500 non-null   float64
 1   high    2500 non-null   float64
 2   low     2500 non-null   float64
 3   close   2500 non-null   float64
 4   volume  2500 non-null   float64
 5   return  2499 non-null   float64
dtypes: float64(6)
memory usage: 136.7 KB
None
[73]:
open high low close volume return
date
2012-12-03 205.15 211.90 205.15 209.65 100947.0 NaN
2012-12-04 210.00 211.45 204.70 205.75 129566.0 -1.860243
2012-12-05 207.00 208.30 205.50 207.15 105079.0 0.680437
2012-12-06 207.50 209.00 203.60 206.65 194948.0 -0.241371
2012-12-07 206.00 209.70 205.05 206.25 101636.0 -0.193564
[74]:
VimeoVideo("764772505", h="0d303013a8", width=600)
[74]:
**Task 8.2.20:** Add a `"return"` column to `df_suzlon`.

  • Calculate the percentage change of a column using pandas.
  • Create new columns derived from existing columns in a DataFrame using pandas.
[75]:
# Sort DataFrame ascending by date
df_suzlon.sort_index(inplace=True)
​
# Create "return" column
df_suzlon["return"] = df_suzlon["close"].pct_change() * 100
​
print("df_suzlon shape:", df_suzlon.shape)
print(df_suzlon.info())
df_suzlon.head()
df_suzlon shape: (2500, 6)
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2500 entries, 2012-12-04 to 2023-01-25
Data columns (total 6 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   open    2500 non-null   float64
 1   high    2500 non-null   float64
 2   low     2500 non-null   float64
 3   close   2500 non-null   float64
 4   volume  2500 non-null   float64
 5   return  2499 non-null   float64
dtypes: float64(6)
memory usage: 136.7 KB
None
[75]:
open high low close volume return
date
2012-12-04 19.55 19.60 18.65 18.80 6882221.0 NaN
2012-12-05 18.65 19.55 18.60 18.95 7595425.0 0.797872
2012-12-06 19.30 19.35 18.75 19.05 3557626.0 0.527704
2012-12-07 19.05 19.35 18.60 18.70 4116932.0 -1.837270
2012-12-10 18.70 19.15 18.60 18.70 2267592.0 0.000000
[76]:
wqet_grader.grade("Project 8 Assessment", "Task 8.2.20", df_suzlon)

Good work!

Score: 1

Now let's plot the returns for our two companies and see how the two compare.

[77]:
VimeoVideo("764772480", h="b8ebd6bd2f", width=600)
[77]:
**Task 8.2.21:** Plot the returns for `df_suzlon` and `df_ambuja`. Be sure to label your axes and include a legend.

  • Make a line plot with time series data in pandas.
[79]:
fig, ax = plt.subplots(figsize=(15, 6))
# Plot `df_suzlon` and `df_ambuja`
df_suzlon["return"].plot(ax=ax, label='SUZLON')
df_ambuja["return"].plot(ax=ax, label='AMBUJACEM', color="C1")
​
# Label axes
plt.xlabel("Date")
plt.ylabel("Return")
​
​
# Add legend
plt.legend()
[79]:
<matplotlib.legend.Legend at 0x7fe3c2df7670>
Success! By representing returns as a percentage, we're able to compare two stocks that have very different prices. But what is this visualization telling us? We can see that the returns for Suzlon have a wider spread. We see big gains and big losses. In contrast, the spread for Ambuja is narrower, meaning that the price doesn't fluctuate as much.

This day-to-day fluctuation in returns is called volatility, which is another important factor for investors. So in the next lesson, we'll learn more about volatility and then build a time series model to predict it.

---

Copyright 2022 WorldQuant University. This content is licensed solely for personal use. Redistribution or publication of this material is strictly prohibited.


Usage Guidelines

This lesson is part of the DS Lab core curriculum. For that reason, this notebook can only be used on your WQU virtual machine.

This means:

  • ⓧ No downloading this notebook.
  • ⓧ No re-sharing of this notebook with friends or colleagues.
  • ⓧ No downloading the embedded videos in this notebook.
  • ⓧ No re-sharing embedded videos with friends or colleagues.
  • ⓧ No adding this notebook to public or private repositories.
  • ⓧ No uploading this notebook (or screenshots of it) to other websites, including websites for study resources.

<font size="+3"><strong>8.3. Predicting Volatility</strong></font>

In the last lesson, we learned that one characteristic of stocks that's important to investors is **volatility**. Actually, it's so important that there are several time series models for predicting it. In this lesson, we'll build one such model called **GARCH**. We'll also continue working with assert statements to test our code.

[2]:
import sqlite3
​
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import wqet_grader
from arch import arch_model
from config import settings
from data import SQLRepository
from IPython.display import VimeoVideo
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
​
wqet_grader.init("Project 8 Assessment")
[3]:
VimeoVideo("770039650", h="c39b4b0c08", width=600)
[3]:
# Prepare Data

As always, the first thing we need to do is connect to our data source.

## Import

[4]:
VimeoVideo("770039537", h="a20af766cc", width=600)
[4]:
**Task 8.3.1:** Create a connection to your database and then instantiate a `SQLRepository` named `repo` to interact with that database.

- [Open a connection to a SQL database using sqlite3.](../%40textbook/10-databases-sql.ipynb#sqlite3)
[5]:
connection = sqlite3.connect(settings.db_name, check_same_thread=False)
repo = SQLRepository(connection=connection)
​
print("repo type:", type(repo))
print("repo.connection type:", type(repo.connection))
repo type: <class 'data.SQLRepository'>
repo.connection type: <class 'sqlite3.Connection'>
Now that we're connected to a database, let's pull out what we need.

[6]:
VimeoVideo("770039513", h="74530cf5b8", width=600)
[6]:
**Task 8.3.2:** Pull the most recent 2,500 rows of data for Ambuja Cement from your database. Assign the results to the variable `df_ambuja`.

  • Inspect a DataFrame using shape, info, and head in pandas.
[7]:
df_ambuja = repo.read_table(table_name="AMBUJACEM.BSE", limit=2500)
​
print("df_ambuja type:", type(df_ambuja))
print("df_ambuja shape:", df_ambuja.shape)
df_ambuja.head()
df_ambuja type: <class 'pandas.core.frame.DataFrame'>
df_ambuja shape: (2500, 5)
[7]:
open high low close volume
date
2023-01-24 501.20 508.55 497.55 498.55 100346.0
2023-01-23 517.40 518.45 498.55 500.90 126483.0
2023-01-20 519.05 522.70 515.30 517.20 55838.0
2023-01-19 518.50 525.40 517.30 519.00 82121.0
2023-01-18 516.65 522.00 513.00 519.90 82300.0
To train our model, the only data we need are the daily returns for `"AMBUJACEM.BSE"`. We learned how to calculate returns in the last lesson, but now let's formalize that process with a wrangle function.

[8]:
VimeoVideo("770039434", h="4fdcd5ffcb", width=600)
[8]:
**Task 8.3.3:** Create a `wrangle_data` function whose output is the returns for a stock stored in your database. Use the docstring as a guide and the assert statements in the following code block to test your function.

  • What's a function?
  • Write a function in Python.
[9]:
def wrangle_data(ticker, n_observations):
​
    """Extract table data from database. Calculate returns.
​
    Parameters
    ----------
    ticker : str
        The ticker symbol of the stock (also table name in database).
​
    n_observations : int
        Number of observations to return.
​
    Returns
    -------
    pd.Series
        Name will be `"return"`. There will be no `NaN` values.
    """
    # Get table from database
    df = repo.read_table(table_name=ticker, limit=n_observations + 1)
​
    # Sort DataFrame ascending by date
    df.sort_index(inplace=True)
​
    # Create "return" column
    df["return"] = df["close"].pct_change() * 100
​
    # Return returns
    return df["return"].dropna()
When you run the cell below to test your function, you'll also create a Series `y_ambuja` that we'll use to train our model.

[10]:
y_ambuja = wrangle_data(ticker="AMBUJACEM.BSE", n_observations=2500)
​
# Is `y_ambuja` a Series?
assert isinstance(y_ambuja, pd.Series)
​
# Are there 2500 observations in the Series?
assert len(y_ambuja) == 2500
​
# Is `y_ambuja` name "return"?
assert y_ambuja.name == "return"
​
# Does `y_ambuja` have a DatetimeIndex?
assert isinstance(y_ambuja.index, pd.DatetimeIndex)
​
# Is index sorted ascending?
assert all(y_ambuja.index == y_ambuja.sort_index(ascending=True).index)
​
# Are there no `NaN` values?
assert y_ambuja.isnull().sum() == 0
​
y_ambuja.head()
[10]:
date
2012-12-03    0.938854
2012-12-04   -1.860243
2012-12-05    0.680437
2012-12-06   -0.241371
2012-12-07   -0.193564
Name: return, dtype: float64
Great work! Now that we've got a wrangle function, let's get the returns for Suzlon Energy, too.

[11]:
VimeoVideo("770039414", h="8e8317029e", width=600)
[11]:
**Task 8.3.4:** Use your `wrangle_data` function to get the returns for the 2,500 most recent trading days of Suzlon Energy. Assign the results to `y_suzlon`.

  • What's a function?
  • Write a function in Python.
[12]:
y_suzlon = wrangle_data(ticker="SUZLON.BSE", n_observations=2500)
​
print("y_suzlon type:", type(y_suzlon))
print("y_suzlon shape:", y_suzlon.shape)
y_suzlon.head()
y_suzlon type: <class 'pandas.core.series.Series'>
y_suzlon shape: (2500,)
[12]:
date
2012-12-04   -3.589744
2012-12-05    0.797872
2012-12-06    0.527704
2012-12-07   -1.837270
2012-12-10    0.000000
Name: return, dtype: float64
## Explore

Let's recreate the volatility time series plot we made in the last lesson so that we have a visual aid to talk about what volatility is.

[13]:
fig, ax = plt.subplots(figsize=(15, 6))
​
# Plot returns for `df_suzlon` and `df_ambuja`
y_suzlon.plot(ax=ax, label="SUZLON")
y_ambuja.plot(ax=ax, label="AMBUJACEM")
​
# Label axes
plt.xlabel("Date")
plt.ylabel("Return")
​
# Add legend
plt.legend();
The above plot shows how returns change over time. This may seem like a totally new concept, but if we visualize them without considering time, things will start to look familiar.

[14]:
VimeoVideo("770039370", h="dde163e45b", width=600)
[14]:
**Task 8.3.5:** Create a histogram of `y_ambuja` with 25 bins. Be sure to label the x-axis `"Returns"`, the y-axis `"Frequency [count]"`, and use the title `"Distribution of Ambuja Cement Daily Returns"`.

  • What's a histogram?
  • Create a histogram using Matplotlib.
[15]:
# Create histogram of `y_ambuja`, 25 bins
plt.hist(y_ambuja, bins=25)
​
# Add axis labels
plt.xlabel("Returns")
plt.ylabel("Frequency [count]")
​
​
# Add title
plt.title("Distribution of Ambuja Cement Daily Returns")
[15]:
Text(0.5, 1.0, 'Distribution of Ambuja Cement Daily Returns')
This is a familiar shape! It turns out that returns follow an almost normal distribution, centered on `0`. **Volatility** is the measure of the spread of these returns around the mean. In other words, volatility in finance is the same thing as standard deviation in statistics.

Let's start by measuring the daily volatility of our two stocks. Since our data frequency is also daily, this will be exactly the same as calculating the standard deviation.

[16]:
VimeoVideo("770039332", h="d43d49b8e7", width=600)
[16]:
**Task 8.3.6:** Calculate daily volatility for Suzlon and Ambuja, assigning them to the variables `suzlon_daily_volatility` and `ambuja_daily_volatility`, respectively.

  • What's volatility?
  • Calculate the volatility for an asset using Python.
[17]:
suzlon_daily_volatility = y_suzlon.std()
ambuja_daily_volatility = y_ambuja.std()
​
print("Suzlon Daily Volatility:", suzlon_daily_volatility)
print("Ambuja Daily Volatility:", ambuja_daily_volatility)
Suzlon Daily Volatility: 3.979808410375838
Ambuja Daily Volatility: 1.8911809353492421
Looks like Suzlon is more volatile than Ambuja. This reinforces what we saw in our time series plot, where Suzlon returns have a much wider spread.

While daily volatility is useful, investors are also interested in volatility over other time periods — like annual volatility. Keep in mind that a year isn't 365 days for a stock market, though. After excluding weekends and holidays, most markets have only 252 trading days.

So how do we go from daily to annual volatility? The same way we calculated the standard deviation for our multi-day experiment in Project 7!

[18]:
VimeoVideo("770039290", h="5b8452708a", width=600)
[18]:
**Task 8.3.7:** Calculate the annual volatility for Suzlon and Ambuja, assigning the results to `suzlon_annual_volatility` and `ambuja_annual_volatility`, respectively.

  • What's volatility?
  • Calculate the volatility for an asset using Python.
[19]:
suzlon_annual_volatility = suzlon_daily_volatility * np.sqrt(252)
ambuja_annual_volatility = ambuja_daily_volatility * np.sqrt(252)
​
print("Suzlon Annual Volatility:", suzlon_annual_volatility)
print("Ambuja Annual Volatility:", ambuja_annual_volatility)
Suzlon Annual Volatility: 63.17749991722655
Ambuja Annual Volatility: 30.021566634963698
Again, Suzlon has higher volatility than Ambuja. What do you think it means that the annual volatility is larger than daily?

Since we're dealing with time series data, another way to look at volatility is by calculating it using a rolling window. We'll do this the same way we calculated the rolling average for PM 2.5 levels in Project 3. Here, we'll start focusing on Ambuja Cement exclusively.

[20]:
VimeoVideo("770039248", h="71064ba910", width=600)
[20]:
**Task 8.3.8:** Calculate the rolling volatility for `y_ambuja`, using a 50-day window. Assign the result to `ambuja_rolling_50d_volatility`.

  • What's a rolling window?
  • Do a rolling window calculation in pandas.
[21]:
ambuja_rolling_50d_volatility = y_ambuja.rolling(window=50).std().dropna()
​
print("rolling_50d_volatility type:", type(ambuja_rolling_50d_volatility))
print("rolling_50d_volatility shape:", ambuja_rolling_50d_volatility.shape)
ambuja_rolling_50d_volatility.head()
rolling_50d_volatility type: <class 'pandas.core.series.Series'>
rolling_50d_volatility shape: (2451,)
[21]:
date
2013-02-11    1.693489
2013-02-12    1.686803
2013-02-13    1.672411
2013-02-14    1.669380
2013-02-15    1.674610
Name: return, dtype: float64
Now let's plot this rolling volatility alongside the daily returns for Ambuja Cement.

[22]:
VimeoVideo("770039209", h="8250d0a2d4", width=600)
[22]:
**Task 8.3.9:** Create a time series plot showing the daily returns for Ambuja Cement and the 50-day rolling volatility. Be sure to label your axes and include a legend.

  • Make a line plot with time series data in pandas.
[23]:
fig, ax = plt.subplots(figsize=(15, 6))
​
# Plot `y_ambuja`
y_ambuja.plot(ax=ax, label="Daily Return")
​
# Plot `ambuja_rolling_50d_volatility`
ambuja_rolling_50d_volatility.plot(ax=ax, label="ambuja_rolling_50d_volatility", linewidth=3)
​
# Add x-axis label
plt.xlabel("Date")
​
# Add legend
plt.legend()
​
[23]:
<matplotlib.legend.Legend at 0x7f2ed0c0bb50>
Here we can see that volatility goes up when the returns change drastically — either up or down. For instance, we can see a big increase in volatility in May 2020, when there were several days of large negative returns. We can also see volatility go down in August 2022, when there are only small day-to-day changes in returns.

This plot reveals a problem. We want to use returns to see if high volatility on one day is associated with high volatility on the following day. But high volatility is caused by large changes in returns, which can be either positive or negative. How can we assess negative and positive numbers together without them canceling each other out? One solution is to take the absolute value of the numbers, which is what we do to calculate performance metrics like mean absolute error. The other solution, which is more common in this context, is to square all the values.

[24]:
VimeoVideo("770039182", h="1c7ee27846", width=600)
[24]:
**Task 8.3.10:** Create a time series plot of the squared returns in `y_ambuja`. Don't forget to label your axes.


  • Make a line plot with time series data in pandas.
[25]:
fig, ax = plt.subplots(figsize=(15, 6))

# Plot squared returns
(y_ambuja ** 2).plot(ax=ax)

# Add axis labels
plt.xlabel("Date")
plt.ylabel("Squared Return");
Perfect! Now it's much easier to see that (1) we have periods of high and low volatility, and (2) high volatility days tend to cluster together. This is a perfect situation to use a GARCH model.


A GARCH model is sort of like the ARMA model we learned about in Lesson 3.4. It has a p parameter handling correlations at prior time steps and a q parameter for dealing with "shock" events. It also uses the notion of lag. To see how many lags we should have in our model, we should create an ACF and PACF plot — but using the squared returns.
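The GARCH(1,1) recurrence behind this is simple enough to sketch in a few lines of plain Python. This is an illustrative simulation, not part of the lesson's code, and the parameter values are made up:

```python
import random

random.seed(42)

# GARCH(1,1): sigma2[t] = omega + alpha * return[t-1]**2 + beta * sigma2[t-1]
omega, alpha, beta = 0.18, 0.07, 0.88  # hypothetical parameters
sigma2 = omega / (1 - alpha - beta)    # start at the long-run variance

returns = []
for _ in range(1000):
    r = random.gauss(0, sigma2 ** 0.5)  # draw a return at the current volatility
    returns.append(r)
    sigma2 = omega + alpha * r ** 2 + beta * sigma2  # update conditional variance

# A large shock raises sigma2, which makes the next few returns more volatile:
# that feedback loop is what produces the volatility clustering we saw above.
```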

[26]:
VimeoVideo("770039152", h="74c63d13ac", width=600)
[26]:
**Task 8.3.11:** Create an ACF plot of squared returns for Ambuja Cement. Be sure to label your x-axis `"Lag [days]"` and your y-axis `"Correlation Coefficient"`.


  • What's an ACF plot?
  • Create an ACF plot using statsmodels.
[27]:
fig, ax = plt.subplots(figsize=(15, 6))

# Create ACF of squared returns
plot_acf(y_ambuja ** 2, ax=ax)

# Add axis labels
plt.xlabel("Lag [days]")
plt.ylabel("Correlation Coefficient");
[28]:
VimeoVideo("770039126", h="4cfbc287d8", width=600)
[28]:
**Task 8.3.12:** Create a PACF plot of squared returns for Ambuja Cement. Be sure to label your x-axis `"Lag [days]"` and your y-axis `"Correlation Coefficient"`.


  • What's a PACF plot?
  • Create a PACF plot using statsmodels.
[29]:
fig, ax = plt.subplots(figsize=(15, 6))

# Create PACF of squared returns
plot_pacf(y_ambuja ** 2, ax=ax)

# Add axis labels
plt.xlabel("Lag [days]")
plt.ylabel("Correlation Coefficient");
In our PACF, it looks like a lag of 3 would be a good starting point. 


Normally, at this point in the model-building process, we would split our data into training and test sets and then set a baseline. Not this time. This is because our model's input and its output are two different measurements: we'll use returns to train our model, but we want it to predict volatility. A conventional test set wouldn't contain the "true values" we'd need to score our model's predictions against. So this time, we'll skip the baseline and go right to iterating.

## Split

The last thing we need to do before building our model is to create a training set. Note that we won't create a test set here. Rather, we'll use all of `y_ambuja` to conduct walk-forward validation after we've built our model. 


[30]:
VimeoVideo("770039107", h="8c9fbe0f4d", width=600)
[30]:
**Task 8.3.13:** Create a training set `y_ambuja_train` that contains the first 80% of the observations in `y_ambuja`. 


[31]:
cutoff_test = int(len(y_ambuja) * 0.8)
y_ambuja_train = y_ambuja.iloc[:cutoff_test]

print("y_ambuja_train type:", type(y_ambuja_train))
print("y_ambuja_train shape:", y_ambuja_train.shape)
y_ambuja_train.tail()
y_ambuja_train type: <class 'pandas.core.series.Series'>
y_ambuja_train shape: (2000,)
[31]:
date
2021-01-12   -0.795854
2021-01-13   -1.138060
2021-01-14    0.717116
2021-01-15   -1.068016
2021-01-18   -3.049242
Name: return, dtype: float64
# Build Model

Just like we did the last time we built a model like this, we'll begin by iterating.

## Iterate


[32]:
VimeoVideo("770039693", h="f06bf81928", width=600)
[32]:
[33]:
VimeoVideo("770039053", h="beaf7753d4", width=600)
[33]:
**Task 8.3.14:** Build and fit a GARCH model using the data in `y_ambuja`. Start with `3` as the value for `p` and `q`. Then use the model summary to assess its performance and try other lags.


  • What's a GARCH model?
  • What's AIC?
  • What's BIC?
  • Build a GARCH model using arch.
[34]:
# Build and train model (GARCH(1,1) chosen after trying other lags)
model = arch_model(
    y_ambuja_train,
    p=1,
    q=1,
    rescale=False
).fit(disp=0)
print("model type:", type(model))

# Show model summary
model.summary()
model type: <class 'arch.univariate.base.ARCHModelResult'>
[34]:
Constant Mean - GARCH Model Results
Dep. Variable: return R-squared: 0.000
Mean Model: Constant Mean Adj. R-squared: 0.000
Vol Model: GARCH Log-Likelihood: -4008.57
Distribution: Normal AIC: 8025.13
Method: Maximum Likelihood BIC: 8047.54
No. Observations: 2000
Date: Thu, Jan 26 2023 Df Residuals: 1999
Time: 10:58:17 Df Model: 1
Mean Model
coef std err t P>|t| 95.0% Conf. Int.
mu 0.0488 3.905e-02 1.248 0.212 [-2.779e-02, 0.125]
Volatility Model
coef std err t P>|t| 95.0% Conf. Int.
omega 0.1785 6.948e-02 2.570 1.018e-02 [4.235e-02, 0.315]
alpha[1] 0.0705 1.844e-02 3.821 1.327e-04 [3.432e-02, 0.107]
beta[1] 0.8783 3.296e-02 26.651 1.750e-156 [ 0.814, 0.943]


Covariance estimator: robust
Tip: You can access the AIC and BIC scores programmatically: every `ARCHModelResult` has an `.aic` and a `.bic` attribute. Try it for yourself by entering `model.aic` or `model.bic`.
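For reference, the AIC and BIC in the summary come straight from the log-likelihood. Here's a quick check of the formulas against the numbers above, assuming k = 4 estimated parameters (mu, omega, alpha[1], beta[1]) and n = 2000 observations:

```python
import math

log_likelihood = -4008.57  # Log-Likelihood from the summary above
k, n = 4, 2000             # number of estimated parameters, number of observations

aic = 2 * k - 2 * log_likelihood            # AIC = 2k - 2 log L
bic = k * math.log(n) - 2 * log_likelihood  # BIC = k ln(n) - 2 log L

print(round(aic, 2))  # → 8025.14 (summary shows 8025.13; difference is rounding of log L)
print(round(bic, 2))  # → 8047.54
```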

Now that we've settled on a model, let's visualize its predictions, together with the Ambuja returns.


[35]:
VimeoVideo("770039014", h="5e41551d9f", width=600)
[35]:
**Task 8.3.15:** Create a time series plot with the Ambuja returns and the conditional volatility for your `model`. Be sure to include axis labels and add a legend.


  • Make a line plot with time series data in pandas.
[36]:
fig, ax = plt.subplots(figsize=(15, 6))

# Plot `y_ambuja_train`
y_ambuja_train.plot(ax=ax, label="Ambuja Daily Return")

# Plot conditional volatility * 2
(2 * model.conditional_volatility).plot(
    ax=ax, label="2 SD Conditional Volatility", color="C1", linewidth=3
)

# Plot conditional volatility * -2
(-2 * model.conditional_volatility).rename("").plot(ax=ax, color="C1", linewidth=3)

# Add axis labels
plt.xlabel("Date")
plt.ylabel("Return")

# Add legend
plt.legend()
[36]:
<matplotlib.legend.Legend at 0x7f2ed0a02490>
Visually, our model looks pretty good, but we should examine residuals, just to make sure. In the case of GARCH models, we need to look at the standardized residuals. 

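A standardized residual is just the raw residual divided by the model's conditional volatility for that same day; in `arch`, that's what the `.std_resid` attribute holds. As a plain-Python sketch with hypothetical numbers:

```python
# Hypothetical residuals (return minus the model's mean, mu)
residuals = [1.0, -3.0, 0.5]
# Hypothetical conditional volatility for the same three days
conditional_volatility = [2.0, 2.0, 1.0]

# Divide each residual by that day's volatility estimate
std_resid = [r / v for r, v in zip(residuals, conditional_volatility)]
print(std_resid)  # → [0.5, -1.5, 0.5]
```

If the model has captured the volatility dynamics, the standardized residuals should look like plain white noise.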

[37]:
VimeoVideo("770038994", h="2a13ab49a7", width=600)
[37]:
**Task 8.3.16:** Create a time series plot of the standardized residuals for your `model`. Be sure to include axis labels and a legend.


  • Make a line plot with time series data in pandas.
  • What are standardized residuals in a GARCH model?
[38]:
fig, ax = plt.subplots(figsize=(15, 6))

# Plot standardized residuals
model.std_resid.plot(ax=ax, label="Standardized Residuals")

# Add axis labels
plt.xlabel("Date")
plt.ylabel("Value")

# Add legend
plt.legend()
[38]:
<AxesSubplot:xlabel='date'>
These residuals look good: they have a consistent mean and spread over time. Let's check their normality using a histogram. 


[39]:
VimeoVideo("770038970", h="f76c8f6529", width=600)
[39]:
**Task 8.3.17:** Create a histogram with 25 bins of the standardized residuals for your model. Be sure to label your axes and use a title. 


  • What's a histogram?
  • Create a histogram using Matplotlib.
[41]:
# Create histogram of standardized residuals, 25 bins
plt.hist(model.std_resid, bins=25)

# Add axis labels
plt.xlabel("Standardized Residual")
plt.ylabel("Frequency [count]")

# Add title
plt.title("Distribution of Standardized Residuals");
Our last visualization will be the ACF of standardized residuals. Just like we did with our first ACF, we'll need to square the values here, too.


[42]:
VimeoVideo("770038952", h="c7a3cfe34f", width=600)
[42]:
**Task 8.3.18:** Create an ACF plot of the square of your standardized residuals. Don't forget axis labels!


  • What's an ACF plot?
  • Create an ACF plot using statsmodels.
[43]:
fig, ax = plt.subplots(figsize=(15, 6))

# Create ACF of squared, standardized residuals
plot_acf(model.std_resid ** 2, ax=ax)

# Add axis labels
plt.xlabel("Lag [days]")
plt.ylabel("Correlation Coefficient");
[43]:
Excellent! Looks like this model is ready for a final evaluation.


## Evaluate

To evaluate our model, we'll do walk-forward validation. Before we do, let's take a look at how this model returns its predictions.


[44]:
VimeoVideo("770038921", h="f74869b8fc", width=600)
[44]:
**Task 8.3.19:** Create a one-day forecast from your `model` and assign the result to the variable `one_day_forecast`. 


  • What's variance?
  • Generate a forecast for a model using arch.
[49]:
one_day_forecast = model.forecast(horizon=1, reindex=False).variance.iloc[0, 0] ** 0.5

print("one_day_forecast type:", type(one_day_forecast))
one_day_forecast
one_day_forecast type: <class 'numpy.float64'>
[49]:
1.8485320751360448
There are two things we need to keep in mind here. First, our `model` forecast gives the predicted **variance**, not the **standard deviation** / **volatility**, so we need to take the square root of the value. Second, before we extract a value, the prediction comes as a DataFrame. It has a DatetimeIndex, and the date is the last day for which we have training data. The `"h.1"` column stands for "horizon 1", that is, our model's prediction for the following day. We'll have to keep all this in mind when we reformat this prediction to serve to the end user of our application.

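The variance-to-volatility conversion can be sketched with a hypothetical one-day forecast (the variance value below is made up, keyed the way the forecast DataFrame labels its columns):

```python
# Hypothetical one-day variance forecast, keyed like the DataFrame column
variance_forecast = {"h.1": 3.417071}

# Volatility (standard deviation) is the square root of variance
volatility = variance_forecast["h.1"] ** 0.5
print(round(volatility, 4))  # → 1.8485
```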

[50]:
VimeoVideo("770038861", h="10efe8c445", width=600)
[50]:
**Task 8.3.20:** Complete the code below to do walk-forward validation on your `model`. Then run the following code block to visualize the model's test predictions.


  • What's walk-forward validation?
  • Perform walk-forward validation for time series model.
[54]:
# Create empty list to hold predictions
predictions = []

# Calculate size of test data (20%)
test_size = int(len(y_ambuja) * 0.2)

# Walk forward
for i in range(test_size):
    # Create training data up to (but not including) the day to predict
    y_train = y_ambuja.iloc[: -(test_size - i)]

    # Train model
    model = arch_model(y_train, p=1, q=1, rescale=False).fit(disp=0)

    # Generate next prediction (volatility, not variance)
    next_pred = model.forecast(horizon=1, reindex=False).variance.iloc[0, 0] ** 0.5

    # Append prediction to list
    predictions.append(next_pred)

# Create Series from predictions list
y_test_wfv = pd.Series(predictions, index=y_ambuja.tail(test_size).index)

print("y_test_wfv type:", type(y_test_wfv))
print("y_test_wfv shape:", y_test_wfv.shape)
y_test_wfv.head()
y_test_wfv type: <class 'pandas.core.series.Series'>
y_test_wfv shape: (500,)
[54]:
date
2021-01-19    1.848532
2021-01-20    1.879677
2021-01-21    1.811127
2021-01-22    2.004459
2021-01-25    2.004325
dtype: float64
[55]:
fig, ax = plt.subplots(figsize=(15, 6))

# Plot returns for test data
y_ambuja.tail(test_size).plot(ax=ax, label="Ambuja Return")

# Plot volatility predictions * 2
(2 * y_test_wfv).plot(ax=ax, c="C1", label="2 SD Predicted Volatility")

# Plot volatility predictions * -2
(-2 * y_test_wfv).plot(ax=ax, c="C1")

# Label axes
plt.xlabel("Date")
plt.ylabel("Return")

# Add legend
plt.legend();
This looks pretty good. Our volatility predictions seem to follow the changes in returns over time. This is especially clear in the low-volatility period in the summer of 2022 and the high-volatility period in fall 2022.


One additional step we could do to evaluate how our model performs on the test data would be to plot the ACF of the standardized residuals for only the test set. But you can do that step on your own.

# Communicate Results

Normally in this section, we create visualizations for a human audience, but our goal for *this* project is to create an API for a *computer* audience. So we'll focus on transforming our model's predictions to JSON format, which is what we'll use to send predictions in our application. 


The first thing we need to do is create a DatetimeIndex for our predictions. Using labels like `"h.1"`, `"h.2"`, etc., won't work. There are two things to keep in mind. First, we can't include dates that fall on weekends, because no trading happens on those days. Second, we'll need to write our dates as strings that follow the ISO 8601 standard.
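pandas handles both concerns for us (with `pd.bdate_range` and `Timestamp.isoformat`), but the idea is simple enough to sketch in plain Python. Note this sketch only skips weekends; it knows nothing about exchange holidays:

```python
from datetime import date, timedelta

def next_business_days(last_training_date, n):
    """Return the next `n` weekday dates as ISO 8601 strings."""
    days = []
    d = last_training_date
    while len(days) < n:
        d += timedelta(days=1)
        if d.weekday() < 5:  # Monday=0 ... Friday=4; 5 and 6 are the weekend
            days.append(d.isoformat())
    return days

# 2023-01-23 is a Monday, so the five days that follow skip one weekend:
print(next_business_days(date(2023, 1, 23), 5))
# → ['2023-01-24', '2023-01-25', '2023-01-26', '2023-01-27', '2023-01-30']
```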

[51]:
VimeoVideo("770038804", h="8976257596", width=600)
[51]:
**Task 8.3.21:** Below is a `prediction`, which contains a 5-day forecast from our `model`. Using it as a starting point, create a `prediction_index`. This should be a list with the following 5 dates written in ISO 8601 format.


  • Create a fixed frequency DatetimeIndex in pandas.
  • Transform a Timestamp to ISO 8601 format in pandas.
[59]:
# Generate 5-day volatility forecast
prediction = model.forecast(horizon=5, reindex=False).variance ** 0.5
print(prediction)

# Calculate forecast start date
start = prediction.index[0] + pd.DateOffset(days=1)

# Create date range
prediction_dates = pd.bdate_range(start=start, periods=prediction.shape[1])

# Create prediction index labels, ISO 8601 format
prediction_index = [d.isoformat() for d in prediction_dates]

print("prediction_index type:", type(prediction_index))
print("prediction_index len:", len(prediction_index))
prediction_index[:3]
                 h.1       h.2       h.3       h.4       h.5
date                                                        
2023-01-23  1.773645  1.780819  1.787626  1.794086  1.800218
prediction_index type: <class 'list'>
prediction_index len: 5
[59]:
['2023-01-24T00:00:00', '2023-01-25T00:00:00', '2023-01-26T00:00:00']
Now that we know how to create the index, let's create a function to combine the index and predictions, and then return a dictionary where each key is a date and each value is a predicted volatility. 


[52]:
VimeoVideo("770039565", h="d419d0a78d", width=600)
[52]:
**Task 8.3.22:** Create a `clean_prediction` function. It should take a variance prediction DataFrame as input and return a dictionary where each key is a date in ISO 8601 format and each value is the predicted volatility. Use the docstring as a guide and the assert statements to test your function. When you're satisfied with the result, submit it to the grader.


  • What's a function?
  • Write a function in Python.
[60]:
def clean_prediction(prediction):
    """Reformat model prediction to JSON.

    Parameters
    ----------
    prediction : pd.DataFrame
        Variance from an `ARCHModelForecast`

    Returns
    -------
    dict
        Forecast of volatility. Each key is a date in ISO 8601 format.
        Each value is the predicted volatility.
    """
    # Calculate forecast start date
    start = prediction.index[0] + pd.DateOffset(days=1)

    # Create date range
    prediction_dates = pd.bdate_range(start=start, periods=prediction.shape[1])

    # Create prediction index labels, ISO 8601 format
    prediction_index = [d.isoformat() for d in prediction_dates]

    # Extract predictions from DataFrame, take square root to get volatility
    data = prediction.values.flatten() ** 0.5

    # Combine `data` and `prediction_index` into Series
    prediction_formatted = pd.Series(data, index=prediction_index)

    # Return Series as dictionary
    return prediction_formatted.to_dict()
[61]:
prediction = model.forecast(horizon=10, reindex=False).variance
prediction_formatted = clean_prediction(prediction)

# Is `prediction_formatted` a dictionary?
assert isinstance(prediction_formatted, dict)

# Are keys the correct data type?
assert all(isinstance(k, str) for k in prediction_formatted.keys())

# Are values the correct data type?
assert all(isinstance(v, float) for v in prediction_formatted.values())

prediction_formatted
[61]:
{'2023-01-24T00:00:00': 1.7736452768458828,
 '2023-01-25T00:00:00': 1.780818750383704,
 '2023-01-26T00:00:00': 1.787625557186955,
 '2023-01-27T00:00:00': 1.794085824430055,
 '2023-01-30T00:00:00': 1.80021842893306,
 '2023-01-31T00:00:00': 1.8060410904036288,
 '2023-02-01T00:00:00': 1.811570456053118,
 '2023-02-02T00:00:00': 1.8168221775650162,
 '2023-02-03T00:00:00': 1.8218109812627508,
 '2023-02-06T00:00:00': 1.8265507322128498}
[62]:
wqet_grader.grade("Project 8 Assessment", "Task 8.3.21", prediction_formatted)

🥳

Score: 1

Great work! We now have several components for our application: classes for getting data from an API, classes for storing it in a database, and code for building our model and cleaning our predictions. The next step is creating a class for our model and paths for our application — both of which we'll do in the next lesson.


---

Copyright 2022 WorldQuant University. This content is licensed solely for personal use. Redistribution or publication of this material is strictly prohibited.
